As well as writing programs that perform calculations and write the output data to a .csv file, you will often need to read the data contained in a .csv file. Typically this is data that has been created by a separate program or downloaded from the Internet, and you then have to write a program to process it.
In this example, we will write a program to process some fictitious student grades. The data (in a .csv file) consists of many rows, each representing a student. Each row contains a column for the surname, a column for the student ID number and then four columns holding the grades (on a 0 to 100 scale) for four separate tests. The data has been created using Mockaroo.
MOCK_DATA.csv
Cohalan,771298484,67,51,74,51
Vallack,615278957,45,57,79,49
Bungey,319585891,42,37,57,86
Balden,019478797,80,90,79,26
Probet,701116170,58,35,76,98
Baigent,998970542,86,61,63,65
Quarton,971939596,58,40,100,91
Witt,054978458,59,99,72,46
Burnand,283808017,59,92,51,11
Rawsthorn,814535776,63,50,62,100
Vasyatkin,925395193,84,73,85,57
Crossby,427723213,33,56,100,100
Hanham,597141409,74,30,68,60
Scorthorne,720102259,58,33,100,67
Evans,473955894,93,96,92,89
O'Bee,058336708,44,11,66,44
Lewsam,623498290,60,58,72,74
Skim,412729216,70,51,88,84
Feehely,073095508,49,50,82,100
Brandassi,027384415,77,59,93,51
Umbert,058510748,73,100,78,58
Lygo,140824408,55,75,78,85
Schulkins,200434642,54,52,65,37
Scanlon,962865222,39,49,55,85
Allridge,115735549,75,52,70,27
Batchan,333316640,71,39,94,54
Purveys,583193483,46,74,59,66
Shemilt,374260199,58,88,85,32
Conant,640666260,39,51,89,64
Botright,292406035,77,27,82,95
Schulkins,152835084,60,22,70,45
Domke,624409539,61,71,48,51
Elies,634033768,77,100,56,100
Larham,145602942,63,32,81,29
Vacher,498955516,72,14,73,21
Di Filippo,711731193,65,65,89,64
Blench,496815213,40,24,73,79
Goodread,010007942,50,42,66,56
Minchin,988677183,94,14,69,8
Orrock,509070758,69,54,85,69
Wyles,154625310,49,25,86,48
Tookill,193475472,23,62,53,28
Wimmer,853474046,45,37,83,97
Spira,979696276,14,54,93,61
Felix,875835071,63,2,71,81
Reckhouse,148144410,71,88,77,59
Le Breton De La Vieuville,558486912,47,41,81,57
Horrigan,812230116,45,52,80,64
Le Port,014113685,67,34,91,83
Spitell,043792447,40,88,83,79
Dils,538529529,84,88,55,48
Goodge,602790717,100,50,67,39
Vern,235405979,95,51,62,40
Death,436211337,62,2,93,20
Giacomello,790772359,57,18,80,68
McAvey,843110330,61,17,53,74
Sevitt,711438118,61,40,69,76
Utridge,013821354,69,51,74,56
Calleja,304154084,82,69,72,74
Barenskie,545937322,57,72,70,75
Beckingham,380600744,61,82,61,33
Coggen,520808387,63,0,100,61
Christene,005768811,74,38,100,53
Hodgen,248294860,81,18,73,87
Bisco,203249260,43,45,54,68
Woodruff,721092637,65,68,76,90
Binham,298643920,73,60,76,49
Ivasechko,199652751,31,16,78,92
Gecke,019609957,57,29,53,34
Izakof,347786735,57,14,48,9
Pringour,210495355,71,11,100,74
Burgum,066120460,32,67,93,76
Thomel,827027608,73,54,52,42
Doldon,700489818,69,73,71,58
Benyon,026302393,76,86,89,80
Norvel,010489034,45,52,68,65
Mullard,111450702,85,56,92,63
Phelp,395748563,62,60,57,84
Androli,746030914,55,64,100,71
Vinall,758604004,86,60,60,27
The code to read in the .csv file is shown below.
main.cpp
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

// struct to hold student data
struct Student {
  std::string surname;
  int sid;         // student ID number
  int test1;       // test 1 grade (0 to 100)
  int test2;       // test 2 grade (0 to 100)
  int test3;       // test 3 grade (0 to 100)
  int test4;       // test 4 grade (0 to 100)
  double average;  // average of all tests (equal weight)
};

// function prototypes
int count_lines();
void read_into_array(Student array[], int n);
void print_array(const Student array[], int n);

int main() {
  // count the number of lines in the CSV
  int n = count_lines();
  std::cout << n << " lines read." << std::endl;
  // then create dynamic array of correct size
  Student *students = new Student[n];
  // read the CSV file into the array
  read_into_array(students, n);
  // and then print for debug purposes
  print_array(students, n);
  // release the dynamic array once it is no longer needed
  delete[] students;
}

int count_lines() {
  // create an input file stream
  std::ifstream input;
  // use it to open a file named 'MOCK_DATA.csv'
  input.open("MOCK_DATA.csv");
  // check if the file is not open
  if (!input.is_open()) {
    // print error message and quit if a problem occurred
    std::cerr << "Error! No input file found!\n";
    std::exit(1);
  }

  int n = 0;
  std::string dummy;
  // keep reading lines in file until no lines left to read
  // read into dummy string and increment count
  while (getline(input, dummy)) {
    n++;
  }
  return n;
}

void read_into_array(Student array[], int n) {
  // create an input file stream
  std::ifstream input;
  // use it to open a file named 'MOCK_DATA.csv'
  input.open("MOCK_DATA.csv");
  // check if the file is not open
  if (!input.is_open()) {
    // print error message and quit if a problem occurred
    std::cerr << "Error! No input file found!\n";
    std::exit(1);
  }

  std::string dummy;
  // loop through each line in file
  for (int i = 0; i < n; i++) {
    getline(input, dummy, ',');   // read until first comma
    array[i].surname = dummy;     // write to array
    getline(input, dummy, ',');   // read until next comma
    array[i].sid = std::stoi(dummy);
    getline(input, dummy, ',');   // read until next comma
    array[i].test1 = std::stoi(dummy);
    getline(input, dummy, ',');   // read until next comma
    array[i].test2 = std::stoi(dummy);
    getline(input, dummy, ',');   // read until next comma
    array[i].test3 = std::stoi(dummy);
    getline(input, dummy);        // for the last element, read until
                                  // end of line (default)
    array[i].test4 = std::stoi(dummy);
  }
}

void print_array(const Student array[], int n) {
  // just loop through array and print to terminal
  for (int i = 0; i < n; i++) {
    std::cout << array[i].surname << " | " << array[i].sid << " | "
              << array[i].test1 << " | " << array[i].test2 << " | "
              << array[i].test3 << " | " << array[i].test4 << std::endl;
  }
}
A struct has been defined that can be used to store the data for each student. A more object-oriented approach could have been to create a relevant class. The main() function is relatively simple: the number of lines in the data file is counted and a dynamic array of the struct type is created to hold the data. This array is then passed into a function, and the data from the .csv file is read into the array. Finally, the array is passed into a print function so that it can be printed to the command line.
To count the number of lines, an input file stream is created and the .csv file opened. The getline() function is then used inside a while loop. getline() reads a line from the input stream into a string. By default it reads until it finds a newline (\n) character. At this point, we are not interested in the content of the line, so it is just read into a dummy string and a counter is incremented on each loop. At the end of the file, this counter will be equal to the number of lines in the file.
Once the dynamic array of the required size has been created, the data in the .csv file is read into the array. The code loops through each line in the file. By default, getline() reads until a newline character is found. However, for CSV data, we wish to read until we find a comma.
getline(input, dummy, ',');
Each field is read into a dummy string and converted to an integer when required using std::stoi(). Note that for the last element, we want to read to the end of the line, i.e. up to a newline character and not a comma.
Now that the data is in an array, it can be iterated over for analysis and processing. It can also be trivially printed to the terminal.
If the above example is run, the following will appear in the terminal.
Terminal output
80 lines read.
Cohalan | 771298484 | 67 | 51 | 74 | 51
Vallack | 615278957 | 45 | 57 | 79 | 49
Bungey | 319585891 | 42 | 37 | 57 | 86
Balden | 19478797 | 80 | 90 | 79 | 26
Probet | 701116170 | 58 | 35 | 76 | 98
Baigent | 998970542 | 86 | 61 | 63 | 65
Quarton | 971939596 | 58 | 40 | 100 | 91
Witt | 54978458 | 59 | 99 | 72 | 46
Burnand | 283808017 | 59 | 92 | 51 | 11
Rawsthorn | 814535776 | 63 | 50 | 62 | 100
Vasyatkin | 925395193 | 84 | 73 | 85 | 57
Crossby | 427723213 | 33 | 56 | 100 | 100
Hanham | 597141409 | 74 | 30 | 68 | 60
Scorthorne | 720102259 | 58 | 33 | 100 | 67
Evans | 473955894 | 93 | 96 | 92 | 89
O'Bee | 58336708 | 44 | 11 | 66 | 44
Lewsam | 623498290 | 60 | 58 | 72 | 74
Skim | 412729216 | 70 | 51 | 88 | 84
Feehely | 73095508 | 49 | 50 | 82 | 100
Brandassi | 27384415 | 77 | 59 | 93 | 51
Umbert | 58510748 | 73 | 100 | 78 | 58
Lygo | 140824408 | 55 | 75 | 78 | 85
Schulkins | 200434642 | 54 | 52 | 65 | 37
Scanlon | 962865222 | 39 | 49 | 55 | 85
Allridge | 115735549 | 75 | 52 | 70 | 27
Batchan | 333316640 | 71 | 39 | 94 | 54
Purveys | 583193483 | 46 | 74 | 59 | 66
Shemilt | 374260199 | 58 | 88 | 85 | 32
Conant | 640666260 | 39 | 51 | 89 | 64
Botright | 292406035 | 77 | 27 | 82 | 95
Schulkins | 152835084 | 60 | 22 | 70 | 45
Domke | 624409539 | 61 | 71 | 48 | 51
Elies | 634033768 | 77 | 100 | 56 | 100
Larham | 145602942 | 63 | 32 | 81 | 29
Vacher | 498955516 | 72 | 14 | 73 | 21
Di Filippo | 711731193 | 65 | 65 | 89 | 64
Blench | 496815213 | 40 | 24 | 73 | 79
Goodread | 10007942 | 50 | 42 | 66 | 56
Minchin | 988677183 | 94 | 14 | 69 | 8
Orrock | 509070758 | 69 | 54 | 85 | 69
Wyles | 154625310 | 49 | 25 | 86 | 48
Tookill | 193475472 | 23 | 62 | 53 | 28
Wimmer | 853474046 | 45 | 37 | 83 | 97
Spira | 979696276 | 14 | 54 | 93 | 61
Felix | 875835071 | 63 | 2 | 71 | 81
Reckhouse | 148144410 | 71 | 88 | 77 | 59
Le Breton De La Vieuville | 558486912 | 47 | 41 | 81 | 57
Horrigan | 812230116 | 45 | 52 | 80 | 64
Le Port | 14113685 | 67 | 34 | 91 | 83
Spitell | 43792447 | 40 | 88 | 83 | 79
Dils | 538529529 | 84 | 88 | 55 | 48
Goodge | 602790717 | 100 | 50 | 67 | 39
Vern | 235405979 | 95 | 51 | 62 | 40
Death | 436211337 | 62 | 2 | 93 | 20
Giacomello | 790772359 | 57 | 18 | 80 | 68
McAvey | 843110330 | 61 | 17 | 53 | 74
Sevitt | 711438118 | 61 | 40 | 69 | 76
Utridge | 13821354 | 69 | 51 | 74 | 56
Calleja | 304154084 | 82 | 69 | 72 | 74
Barenskie | 545937322 | 57 | 72 | 70 | 75
Beckingham | 380600744 | 61 | 82 | 61 | 33
Coggen | 520808387 | 63 | 0 | 100 | 61
Christene | 5768811 | 74 | 38 | 100 | 53
Hodgen | 248294860 | 81 | 18 | 73 | 87
Bisco | 203249260 | 43 | 45 | 54 | 68
Woodruff | 721092637 | 65 | 68 | 76 | 90
Binham | 298643920 | 73 | 60 | 76 | 49
Ivasechko | 199652751 | 31 | 16 | 78 | 92
Gecke | 19609957 | 57 | 29 | 53 | 34
Izakof | 347786735 | 57 | 14 | 48 | 9
Pringour | 210495355 | 71 | 11 | 100 | 74
Burgum | 66120460 | 32 | 67 | 93 | 76
Thomel | 827027608 | 73 | 54 | 52 | 42
Doldon | 700489818 | 69 | 73 | 71 | 58
Benyon | 26302393 | 76 | 86 | 89 | 80
Norvel | 10489034 | 45 | 52 | 68 | 65
Mullard | 111450702 | 85 | 56 | 92 | 63
Phelp | 395748563 | 62 | 60 | 57 | 84
Androli | 746030914 | 55 | 64 | 100 | 71
Vinall | 758604004 | 86 | 60 | 60 | 27