PowerShell Basics #1: Reading and parsing CSV

I will be giving a talk on the topic of “PowerShell for Developers” at TechDays 2010 in Helsinki, Finland. As a warm-up to my presentation, I will be publishing a series of blog posts on various aspects of PowerShell. My viewpoint is specifically one of developer utility: What can PowerShell do to make a dev’s life easier?

I want to start with something that touches on data. Often, developers receive data in Excel format – usually in order to then import it into a database. Reading data from Excel is somewhat painful, but fortunately, Excel allows for easy saving to the CSV format. PowerShell, on the other hand, provides for several quite easy manipulations.

Simple imports with Import-Csv

Let’s start with a simple CSV file, customers.csv:

ID,Name,Country
1,John,United States
2,Beatrice,Germany
3,Jouni,Finland
4,Marcel,France

Turning this into objects in PowerShell is very straightforward:

PS D:\temp> import-csv customers.csv 

ID Name     Country
-- ----     -------
1  John     United States
2  Beatrice Germany
3  Jouni    Finland
4  Marcel   France

As you can see, the first line in the text file gets parsed as a header, specifying the names on the PowerShell objects. The import doesn’t have a notion of strong typing; therefore, all the properties are imported as pure text. Often this is enough, but if it isn’t, look below…

Headerlessness and other cultures

There are a few scenarios where this won’t work. For example, if your CSV doesn’t have headers, you would get objects with column names such as “1”, “John” and “United States”. Lacking headers, you can supply them as a parameter:

import-csv .\customers.csv -header ID,Name,Country

That was easy (but don’t do it when your data has headers, or you end up duplicating them).

Well then, perhaps you live in a region where the field separator isn’t the usual comma? This is no problem to PowerShell, either

PS D:\temp> type customers-fi.csv
ID;Name;Country
1;John;United States
2;Beatrice;Germany
PS D:\temp> import-csv .\customers-fi.csv -delimiter ';'

ID Name     Country
-- ----     -------
1  John     United States
2  Beatrice Germany

If you know the file originated from your current UI culture, you can just dispense with the delimiter specification and type import-csv –useCulture customers-fi.csv. That will pick up the delimiter from your Windows settings.

When your CSV ain’t a file…

Often you get your CSV data in a file. Occasionally, you might download it through HTTP, or even pull it from a database. No matter how, you may end up with an array of strings that contains your CSV data. The Import-Csv cmdlet reads a file, but if you need to parse the data from another source, use ConvertFrom-Csv.

PS D:\temp> $csv = 'ID,Name,Country
>> 1,John,United States
>> 2,Beatrice,Germany'
>>
PS D:\temp> ConvertFrom-Csv $csv

ID Name     Country
-- ----     -------
1  John     United States
2  Beatrice Germany

As far as the culture switches go, everything discussed above also applies to ConvertFrom-Csv.

How about CSV content oddities?

There are some uses of CSV that veer away from the normal, safe path. The first and a reasonably common scenario is having the field delimiter in the data, something that is usually handled by quoting the field. Of course, up next is the scenario where a field contains the quotation mark.

imageAnd finally, there is the really controversial aspect of having a newline in a CSV field. Many parsers struggle with this, and in fact, the correct behavior isn’t exactly clear. Of course, for practical purposes, anything that makes Excel exports work correctly is usually good. But let’s look at an example that contains all of these anomalies (the original content for this is shown in the Excel screenshot to the right).

Number,String,Multiline,Date
512,"Comma,Rocks","Line 1
Line 2",15.1.2010 15:14
57,"Cool ""quotes""","First
Second",7.1.2010 9:33

The data is split across five lines, but actually contains two records and a header. This alone is somewhat controversial, given CSV’s starting point of one line per record. Anyway, it often still needs to be parsed, and PowerShell does a good job here:

PS D:\temp> Import-Csv .\oddstrings.csv

Number String        Multiline     Date
------ ------        ---------     ----
512    Comma,Rocks   Line 1...     15.1.2010 15:14
57     Cool "quotes" First...      7.1.2010 9:33

There is one key thing to notice. Import-Csv works, because it treats the data source as a single whole. However, ConvertFrom-Csv misparses the multiline fields, as it handles the input line-by-line.

PS D:\temp> type .\oddstrings.csv | ConvertFrom-Csv

Number  String          Multiline Date
------  ------          --------- ----
512     Comma,Rocks     Line 1
Line 2" 15.1.2010 15:14
57      Cool "quotes"   First
Second" 7.1.2010 9:33

Strong typing, then?

For this, there are no pre-cooked solutions. But once your data is imported correctly, separators, multiline fields and all, it’s rather easy to just typecast the stuff, providing you input validation at the same time. Consider this CSV file of event descriptions:

StartsAt,Title,Venue
9.3.2010,TechDays 2010,Helsinki
15.2.2010,Mobile World Congress,Barcelona

Next, you want to filter the data to just show the occurring within the next 30 days. For this, you'll want the datetimes parsed into System.DateTime objects.

PS D:\temp> $events = Import-Csv event.csv | foreach {
  New-Object PSObject -prop @{
    StartsAt = [DateTime]::Parse($_.StartsAt);
    Title = $_.Title;
    Venue = $_.Venue
  }
}
PS D:\temp> $events                            

StartsAt          Venue     Title
--------          -----     -----
9.3.2010 0:00:00  Helsinki  TechDays 2010
15.2.2010 0:00:00 Barcelona Mobile World Congress

Now, filtering the list is a snap.

PS D:\temp> $events | where { $_.StartsAt -lt (get-date).AddDays(30) }

StartsAt          Venue     Title
--------          -----     -----
15.2.2010 0:00:00 Barcelona Mobile World Congress

One more thing to notice here: In the example above, it worked because the date format (“15.2.2010”) was in Finnish and I happened to run the PowerShell in a thread with the Finnish culture. However, if your data happens to come from a culture different than your UI, you need to pass in the correct culture specifier. For example, to parse a date in the US locale, use the following:

$usCulture = [System.Globalization.CultureInfo]::CreateSpecificCulture("en-US")
[DateTime]::Parse("04/07/2009", $usCulture)

Note that specifying the culture explicitly is always a good idea if you plan on saving the script and reusing it later. Although you might pay attention to the parsing details at the time of writing, it is quite conceivable for someone to run the script later on in another thread. As dates are prone to silent misparsing (for example, is 04/07/2009 7th April or 4th July?), you could end up manipulating incorrect data.

In addition to dates, you’ll want to look at cultures when parsing decimal numbers and such. Remember: the –UseCulture switch on Import-Csv only applies to the separator settings, not the optional parsing phase.

Enjoy!

January 22, 2010 · Jouni Heikniemi · 34 Comments
Tags: ,  · Posted in: .NET

34 Responses

  1. WebApe - April 22, 2010

    Well done…saved me a lot of time re headerless csv.

  2. Scripting STSADM with PowerShell | SharePoint Blues - April 28, 2010

    […] If you have no idea what I’m talking about, there are plenty of resources on TechNet, and some great tutorials by our CTO, Jouni […]

  3. Stef - August 26, 2010

    This has been the most helpful thing I've found on import-csv including two leading books on Powershell. Thank you for addressing what the results will look like and how to address/manipulate them!

  4. pradeep - July 8, 2011

    How to import .csv file for different culture in vb.net? In usa culture value in excel sheet is 10.0 but in dutch its displaying 10,0. Please give me the code..

  5. TJ - April 27, 2012

    In your example:

    StartsAt Venue Title
    ——– —– —–
    9.3.2010 0:00:00 Helsinki TechDays 2010
    15.2.2010 0:00:00 Barcelona Mobile World Congress

    How do you make StartsAt, Venue, and Title variables and call them in a script?

    The script will have loop until the last line is read.

  6. Jouni Heikniemi - May 2, 2012

    @pradeep: CSV import in VB.NET is an entirely different thing to do. Find out some VB.NET import tutorials and you'll be on your way. Typically, you'd use a CSV import library and set the separator to semicolon (typically used as a field separator in locales where comma is the decimal separator).

    @TJ: I'm not sure I understand what you mean. The example in this post constructs event objects which have these fields parsed as variables. Can you be more specific?

  7. maksemuz - March 6, 2013

    Thanks a lot!
    Your guide is more useful then the technet.microsoft document about import CSV using powershell.
    \m/

  8. rahul - March 29, 2013

    Thanks alot!
    I have one problem, in my csv file headers are in the form of 10-05-08-11 which contains numbers as data, due to which the above mentioned methods are not working. Could you please tell me how can i read the same?

    Ex:
    10-05-08-11,10-08-07-12,10-15-08-99 (Headers)
    2,55,100 (data)
    55,45,88 (data)

    I want to read above data. How can i do it?

    TIA

  9. Jouni Heikniemi - April 22, 2013

    @Rahul, the CSV you mentioned parses just fine (copy that content into a file and do a import-csv foo.csv). Without you specifying what you mean by "not working", I'm not sure I can help.

  10. Importing from CSV and sorting by Date - Just just easy answers - September 6, 2013

    […] http://www.heikniemi.net/hardcoded/2010/01/powershell-basics-1-reading-and-parsing-csv/ […]

  11. santosh - October 24, 2013

    Hi,

    I want to know how to apply background color and other formatting aspects to the header column to a csv file using powershell. Can you please help me with some code sample.

    Thanks & Regards,
    Santosh Kumar Patro

  12. lestriches.fr - August 10, 2016

    Ꮋello it'ѕ me, I am alsⲟ visiting thіs web page regularly, tҺis website is іn fɑct pleasant
    and the viewers ɑre in fact sharing pleasant tɦoughts.

  13. Matt - August 15, 2016

    Hi,

    import-csv does not appear to create a "real array" from the csv, but an "array like powershell object". Is there any way to make a real array, so that i can (for example) count the number of elements as you would with an array like this:

    $array = import-csv .\test.csv
    $array.count

    There is no .count (or any other standard array type options / functions) to the object you receive when using import-csv.

    Thanks !

  14. psn code free 2016.com - August 24, 2016

    At present, you can uncover the Harvest Moon series on PSP, Playstation two, Nintendo DS,
    and Wii.

  15. download code 3ds - August 25, 2016

    The 3DS affords a slider on the highest screen that lets you adjust the intensity of the 3D impact.

  16. Test - September 9, 2016

    I do agree with all of the ideas you have presented to your post.

    They are really convincing and will definitely work.
    Nonetheless, the posts are too brief for beginners.
    May you please extend them a little from next time?
    Thanks for the post.

  17. sildigra soft - September 14, 2016

    What's up to ever , because I am actually keen oof reading this weblog's post to be updated regularly.
    It incpudes pleasant stuff.

  18. custompulloverhoodie.com - September 15, 2016

    Awesome post.

  19. Rock 'n' roll in my blood internet radio station and website.We are streaming 24/7 rock music we include all rock subgenres - September 16, 2016

    you are truly a good webmaster. The site loading speed is incredible.
    It seems that you are doing any distinctive trick.
    In addition, The contents are masterpiece. you have performed a great task in this
    matter!

  20. imitation jewellery kundan india - October 12, 2016

    Great article.

  21. pixel gun 3d hack 2016 - November 22, 2016

    good information you got at this point what are everyones
    first thoughts on here web page about

  22. avakin life online hack - November 22, 2016

    very good intel we have at this time what are you're first
    thoughts with our web page about

  23. buy targeted traffic that converts - December 3, 2016

    Have you ever considered about including a little bit
    more than just your articles? I mean, what you say is important and all.
    But just imagine if you added some great photos or videos to give your
    posts more, "pop"! Your content is excellent but with images
    and clips, this site could undeniably be one of the
    most beneficial in its niche. Superb blog!

  24. Luciano Bernarducci - April 29, 2017

    Thank you,

  25. learn numbers and colors in spanish - May 7, 2017

    For additional spelling practice, have students trade their coloring web page
    or artwork with a companion and neatly label the colors
    they see.

  26. thế giới điện lạnh - July 29, 2017

    With havin so much content do you ever run into any issues of plagorism or copyright violation? My website has a
    lot of unique content I've either written myself or outsourced
    but it seems a lot of it is popping it up all over the internet without my authorization. Do you know any methods
    to help stop content from being stolen? I'd certainly
    appreciate it.

  27. Black Diamond Casino - August 9, 2017

    My brother recommended I may like this web site. He was entirely right.
    This publish actually made my day. You cann't believe simply how so much time
    I had spent for this information! Thank you!

  28. JuanitaX - August 9, 2017

    Hello blogger i see you don't monetize your site. You can earn additional
    $$$ easily, search on youtube for: how to earn selling articles

  29. Exclusive - September 5, 2017

    I am currently converting xlsx to csv format using the PSModule. One things that is an issue is I am getting 6/30/2016 0:00 but the excel has 6/30/2016. But when I manually save the excel format to csv, it saved properly with 6/30/2016.

    There should be a way to format the column so that it does not append 0:00

    Thanks

  30. MartaBiggie - July 8, 2018

    Hello admin, i've been reading your content for some
    time and I really like coming back here. I can see that you probably don't make money on your site.
    I know one interesting method of earning money, I think
    you will like it. Search google for: dracko's tricks

  31. Emilio - October 18, 2018

    Great blog here! Also your web site rather a lot
    up very fast! What host are you the use of? Can I get your affiliate hyperlink
    in your host? I want my web site loaded up as fast as yours lol

  32. Laverne - April 4, 2019

    Yahoo's finest hope foor that was Flickr.

  33. BestTerri - July 16, 2019

    I see you don't monetize heikniemi.net, don't waste your traffic, you
    can earn extra cash every month with new monetization method.
    This is the best adsense alternative for any type of website
    (they approve all sites), for more info simply search in gooogle: murgrabia's tools

  34. warez movies - October 13, 2019

    Thanks for finally writing about >Heikniemi Hardcoded > PowerShell Basics #1: Reading and parsing CSV <Liked it!

Leave a Reply