
[no subject]



#!/usr/bin/perl
use strict;
use warnings;

open my $txt, '<', 'datafile' or die "Can't open datafile: $!";
while (my $line = <$txt>) {
         if ( $line =~ m/^[ ]*\d+[ \t]+/ ) {
             # line starts with a numeric ID; it already ends in a newline
             print $line or die "print failed: $!";
         } elsif ( $line =~ m/^[ ]*"ID"[ \t]+/ ) {
             # header line: do nothing - don't print this line
         } else {
             print ">>missing ID on this line:  $line";
         }
}
close $txt;

Note: each [ \t]+ character class in the code above matches a space or a tab.
There were no tabs in the email you posted, so I chose to compensate
in the code.  I did not understand your regex; see the comment below.

For the "else" branch, you could choose not to print the line, or handle it
some other way.
If you want to actually parse the line to test other fields for being empty,
I've seen some scripts online that use Text::ParseWords.  I could not find
any reference to a Perl command that assigns tokens into positional
parameters (a la $1 $2 $3 etc.) in the way that set does in the Korn shell.
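
The closest thing I know of is Perl's built-in split, which returns the fields
as a list that can be assigned to named variables.  Below is a minimal sketch
of that idea, not a tested solution for your file.  It assumes the fields are
separated by single tabs and that the quotes never contain a tab; the name
parse_line is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split one tab-delimited line into fields, keeping empty ones.
# The -1 limit stops split from discarding trailing empty fields.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;
    s/^"(.*)"$/$1/ for @fields;    # strip surrounding double quotes, if any
    return @fields;
}

# A line whose first field (the ID) is blank.
my ($id, $last, $first) = parse_line(qq{\t"Adams"\t"Portia"\n});
print $id eq '' ? "missing ID: $last, $first\n" : "$id: $last, $first\n";
```

Because a blank column comes back as an empty string rather than being
skipped, every field stays in its own position and can be tested with eq ''.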



>I'm trying to parse a CSV file in perl and I'm having an issue with some
>of the columns being blank.
>
>Here is a sample piece of data.
>
>Id    LASTNAME    FIRSTNAME
>       Adams       Portia
>10572 Alexander   Robert
>
>You can see that the first row does not have an ID.  This can be true
>for all columns.  They may or may not have values. 
>
>Here is how I'm trying to parse:
>
>open TXT, "< Expanded_2005_Select_1.csv";
>while(<TXT>) {
>         m/^(\d+?)\t/;

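OK, in this regex the parens are only needed because you print $1
afterwards (is this Perl 5 or 6?), and the \d+? is not as clear as \d*.
\d+? is a non-greedy "one or more digits", so it still requires at least
one digit and the match fails when the ID is blank.  Did you mean to put
"(\d+)?" ?  Not sure what you were thinking here.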
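
Here is a small sketch of what I mean by \d*, assuming the ID is the first
tab-delimited field and is either digits or blank.  The helper name
leading_id is hypothetical.  The point is to test whether the match
succeeded before touching $1, since a failed match does not set it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the leading ID field of a tab-delimited line:
# digits, an empty string when the field is blank, or undef when
# the line does not start with optional digits followed by a tab.
sub leading_id {
    my ($line) = @_;
    return $line =~ /^(\d*)\t/ ? $1 : undef;
}

for my $line ("10572\tAlexander\tRobert\n", "\tAdams\tPortia\n") {
    my $id = leading_id($line);
    if    (!defined $id) { print "unexpected line: $line" }
    elsif ($id eq '')    { print "blank ID: $line" }
    else                 { print "ID $id\n" }
}
```
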

>         print "$1\n";

Perl on my Mac OS X barfed on the print statement.

>}
>
>Each column is tab delimited.  When I run this I get the lastname in $1
>for the first line and the ID in $1 for the second line.  I need to
>somehow create a regex that would be unforgiving of nothing being there.
>
>Data file looks like this:
>       1 "ID"    "LASTNAME"      "FIRSTNAME"     "TITLE" "COMPANY"
>"ADDRESS        "       "ADDRESS2"      "CITY"  "STATE" "ZIPCODE"
>"COUNTRY"               "PHONE" "EMAIL" "REGTYPE"       "DATE"  "TIME"
>"Question1"     "Questio        n2"     "Question3"     "READERID"
>       2         "Adams" "Portia"        "Director"      "The Rockefeller
>Univers        ity"    "1230 York Ave "                "New York"
>"NY"    "10021-6        399"    "USA"   2123277719
>"adams at rockefeller.edu" "Member"               
>       3 10572   "Alexander"     "Robert"        "Manager Voice & Video
>Solution"                "Air Products and Chemicals, Inc"       "7201
>Hamilton Blvd"                    "Allentown"     "PA"    "18195-1501"
>"USA"   "610-481-7156"          "alexanrw at airproducts.com"      "Member"
>06/12/2005      06:06:14         pm                             60711
>
>The 1,2,3 that you see are the line numbers in vi
>
>
>_______________________________________________
>Ale mailing list
>Ale at ale.org
>http://www.ale.org/mailman/listinfo/ale





</pre>
<!--X-Body-of-Message-End-->
<!--X-MsgBody-End-->
<!--X-Follow-Ups-->
<hr>
<!--X-Follow-Ups-End-->
<!--X-References-->
<!--X-References-End-->
<!--X-BotPNI-->

<!--X-BotPNI-End-->
<!--X-User-Footer-->
<!--X-User-Footer-End-->
</body>
</html>