|   Home   |   Back   | 

Reading Calendar Dates into Mathematica from CSV files

(C) Weber & Partner, 2000

Download the notebook

Reading simple CSV files

Comma separated data or CSV files are still the most common way to export data from trading systems and other applications in finance for analysis and prototyping. The import of basic CSV data into Mathematica is simple and fast with the ReadList command together with setting some options for ReadList like WordSeparators and RecordLists. In almost any case it is usefull to use Word as the data type in ReadList and let Mathematica do the conversion to numbers.

     The first step is to make the right directory available for Mathematica either through using SetDirectory or adjusting the global variable $Path.

[Graphics:Images/index_gr_1.gif]

     Using  !! allows to have a quick look at the data

[Graphics:Images/index_gr_2.gif]
Caplet, 15.25, 6.35
Caplet, 17.25, 6.54
Caplet, 17.25, 6.83
Caplet, 17.50, 7.11
Caplet, 18.00, 7.33

     Now the data is read in using Word as the data type and then the data is converted to numbers and symbols using ToExpression.

[Graphics:Images/index_gr_3.gif]
[Graphics:Images/index_gr_4.gif]

Reading calendar dates into Mathematica

In Mathematica a date is usally represented as a list of three integers like the first three integers of the build in function Date. This format is also used in the few calendar functions which ship with Mathematica (see the standard package Miscellaneous`Calendar`)

[Graphics:Images/index_gr_5.gif]
[Graphics:Images/index_gr_6.gif]

The above approach to read in data is not sufficient to make calendar dates available within Mathematica. To achieve this we have to tell Mathematica how a calendar date looks like in the data which will be read.

     A quick look at the data reveals the structure for the calendar dates

[Graphics:Images/index_gr_7.gif]
21/03/1995,21/06/1995,Caplet,15.25,6.35
21/06/1995,21/09/1995,Caplet,17.25,6.54
21/09/1995,21/12/1995,Caplet,17.25,6.83
21/12/1995,21/03/1996,Caplet,17.50,7.11
21/03/1996,21/06/1996,Caplet,18.00,7.33

The parts of the calendar dates are separated through slashes. We assume that each calendar date in the whole data set has the same structure. Based on this assumption we define a simple conversion function. In this case it is only necessary to do some replacements within the string and reverse it after it is converted to a list of three integers.

     Definition of the conversion function

[Graphics:Images/index_gr_8.gif]

This function is only applied if the string which is passed to the function maches the date template. To simplify the use of this function a second definition for all other cases is provided and the function is made Listable, so that it will be applied to each element of a list.

     Improving its usability

[Graphics:Images/index_gr_9.gif]

Based on these definitions one can proceed as before.

     Applying it to the data

[Graphics:Images/index_gr_10.gif]
[Graphics:Images/index_gr_11.gif]

We found this to be very handy and fast for reading in CSV files with calendar dates.

Some additional ideas

Practical experience shows that the above function is used very often. On Microsoft Windows the calendar format is part of the system setting, so the calendar dates in  CSV files which are generated on Microsoft Windows machine look very often the same. Therefore it makes sense to put the above function into the init.m file which is read from Mathematica at start up. In this case it might even make sense to go one step further. The above definition is nothing more then a special case of the ToExpression command, so we extend this Mathematica command accordingly.

     $DateTemplate is a definition which should be placed in the init.m file. It is always helpful to provide some help information.

[Graphics:Images/index_gr_12.gif]

     The function ToExpression can be extended after an unprotect of the symbol

[Graphics:Images/index_gr_13.gif]

     As mentioned before it is always helpful to provide some help information. In this case the build in help information is also extended.

[Graphics:Images/index_gr_14.gif]

     Our system setting is such that we have WordSeparators->"," as the default and we almost always have a record per row. So we change the default setting of ReadList in the init.m file.

[Graphics:Images/index_gr_15.gif]

     Now reading CSV data with calendar dates into Mathematica is really simple.

[Graphics:Images/index_gr_16.gif]
[Graphics:Images/index_gr_17.gif]

The back side of this extension to Mathematica is, that you get used to them and take them for granted. Currently they will break if your system setting is changing or you update Mathematica.

     To make yourself aware of the changes it is helpful to include another line of code in your init.m file which informs you about your changes.

[Graphics:Images/index_gr_18.gif]
[Graphics:Images/index_gr_19.gif]

     Information on your changes are always available.

[Graphics:Images/index_gr_20.gif]
[Graphics:Images/index_gr_21.gif]

    The ??Operator even shows your extended definition of ToExpression.

[Graphics:Images/index_gr_22.gif]
[Graphics:Images/index_gr_23.gif]
[Graphics:Images/index_gr_24.gif]
[Graphics:Images/index_gr_25.gif]


Converted by Mathematica      January 8, 2001