----BEGIN CLASS----
[12:42] #startclass
[12:42] rtnpro: the stage is yours
[12:43] roll call :tabrez khan
[12:46] anyone else?
[12:46] Anu Kumari Gupta
[12:49] devesh verma
[12:52] cool, we have a few folks here. I will start the session now
[12:53] Today's topic is file handling with Python
[12:54] A file is some information or data which stays in the computer's storage devices
[12:55] You must be interacting with files day in and day out already, e.g. music files, video files, images, source code, etc.
[12:56] Files can be broadly classified into two categories: text files and binary files
[12:56] Text files contain plain text content, readable by humans
[12:57] Binary files contain binary data, readable only by computers
[12:57] Any questions so far?
[13:00] There are a few basic file modes: read (r), write (w) and append (a)
[13:00] “r” -> open read-only: you can read the file but cannot edit or delete anything inside
[13:00] “w” -> open for writing: if the file exists, all its content is deleted and it is opened for writing
[13:00] “a” -> open in append mode
[13:01] We use the `open()` function in Python to open a file. The default mode to open a file is `read`
[13:02] Let's say there's a file hello.txt, and I want to open it for reading, all I need to do is:
[13:02] f = open('hello.txt')
[13:03] now, to read the contents of the file into a string, we can do
[13:03] s = f.read()
[13:03] once we are done reading the file, we should close it
[13:03] f.close()
[13:03] devesh_verma, ann, can you try it out?
[13:05] Yes
[13:07] Now, there are various ways to read a file
[13:07] f.read() reads all the content of the file into a variable (i.e. into memory)
[13:08] This is OK for small files, but not that great for big files, say several GBs.
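The memory concern with f.read() can be eased by passing a size to read(), which then returns at most that many characters per call. A minimal sketch, assuming a hypothetical helper named `read_in_chunks` and a small sample file created on the spot purely for illustration (neither appears in the session):

```python
import os
import tempfile


def read_in_chunks(path, chunk_size=1024):
    """Yield the file's text in pieces of at most chunk_size characters."""
    f = open(path)
    try:
        while True:
            chunk = f.read(chunk_size)
            if chunk == '':  # read() returns an empty string at end of file
                break
            yield chunk
    finally:
        f.close()  # close the file even if the loop is abandoned early


# Hypothetical sample file, created only so the sketch is runnable
sample_path = os.path.join(tempfile.mkdtemp(), 'hello.txt')
f = open(sample_path, 'w')
f.write('hello world\n' * 3)
f.close()

chunks = list(read_in_chunks(sample_path, chunk_size=10))
```

Only one chunk is held in memory at a time, so the same loop should work for files far larger than the available RAM.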
[13:09] f.readline() reads a line at a time
[13:10] Let's say our file content is "line1
[13:10] typo
[13:10] Let's say our file content is "line1\nline2\n\nline3\n"
[13:11] f.readline() will first return "line1", a second f.readline() will return "line2", then "", and finally "line3"
[13:12] !
[13:12] next
[13:13] It prints \n as well at the end
[13:13] for the extra line it prints '\n' and not ""
[13:14] That's expected
[13:14] ok
[13:15] There's another way to read all the lines of a file
[13:15] f.readlines()
[13:16] it returns a list containing all the lines of the file
[13:16] In our case, f.readlines() will return ['line1\n', 'line2\n', '\n', 'line3\n']
[13:17] One point to note is that you need to be careful about when to use readlines(): since it loads all the lines of the file into memory, it's not memory efficient
[13:18] A better way to read the lines of a file is to loop over the file object `f`, it's much better in terms of memory usage
[13:18] f = open('sample.txt')
[13:20] for line in f:
[13:20]     print(line, end='')
[13:20] This will output:
[13:20] https://www.irccloud.com/pastebin/IpqNfM0D/
[13:21] Any questions?
[13:22] so for larger files, this is the most efficient one?
[13:23] yes
[13:23] it does not load the entire file at a time, but loads it line by line
[13:24] Let's do an assignment here
[13:26] `cat` is a Linux command which prints the contents of a file. `cat <filename>` will print the contents of the file
[13:26] Let's try to create a Python script that does something similar
[13:26] !
[13:27] Create a Python file `cat.py` which, when executed, asks you to enter the name of the file you want to read, and then prints the contents of the file to the console
[13:28] next
[13:29] f.readline() works in the interpreter, but it does not print anything when used in a script. Why?
[13:30] f.readline() returns a string; you need to print it to see the value on the console
[13:31] print(f.readline())
[13:35] yes
[13:36] ann, devesh_verma, prokbird, done with the assignment?
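The three ways of reading lines discussed above can be compared side by side. This is a sketch only: the temporary `sample.txt` and the `cat` helper with its `out` parameter are assumptions made for illustration, not the solutions shared in the session.

```python
import io
import os
import sys
import tempfile

# Hypothetical sample file holding the content used in the session
sample_path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f = open(sample_path, 'w')
f.write('line1\nline2\n\nline3\n')
f.close()

# readline() keeps the trailing newline; '' only signals end of file
f = open(sample_path)
first = f.readline()
second = f.readline()
blank = f.readline()
last = f.readline()
at_eof = f.readline()
f.close()

# readlines() loads every line into a list at once
f = open(sample_path)
all_lines = f.readlines()
f.close()


def cat(path, out=sys.stdout):
    """Write the file to out one line at a time, like a tiny `cat`."""
    f = open(path)
    for line in f:  # iterating the file object reads it line by line
        out.write(line)
    f.close()


# Capture the output in a buffer instead of the console for checking
buffer = io.StringIO()
cat(sample_path, buffer)
```

In a `cat.py` script, the last step could instead be `cat(input('Enter file name: '))`, which writes straight to the console.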
[13:39] done
[13:40] could you folks share your code so that I can review?
[13:41] rtnpro: sorry I am travelling in bus :(
[13:41] https://paste.fedoraproject.org/paste/VDCYsO4uNS0vp1eV596Uqw
[13:43] LGTM
[13:44] There's another cool construct to read a file, using the `with` statement
[13:44] https://paste.fedoraproject.org/paste/88R-WnyLmbOyWOp60~36jg
[13:45] https://www.irccloud.com/pastebin/VC3uDwhZ/
[13:45] Awesome, ann
[13:46] the advantage of the `with` construct is that it takes care of closing the file for you
[13:47] https://paste.fedoraproject.org/paste/5kyG2Gyvla40g1iq6ykOiQ
[13:48] palnabarun: your solution looks cool, but with the `strip()` operation, you might not end up printing the file as it is.
[13:48] rtnpro, I tried without it at first. It was printing extra lines.
[13:49] rtnpro, `.strip()` removed the extra `\n`'s
[13:49] palnabarun: got that, you will need to do some string manipulation to get around that. But you got the concept of reading a file, that's important.
[13:50] Let's look into how to write files
[13:52] to open a file for writing, we need to supply a second argument to the `open()` function: `"w"`
[13:53] Opening a file in write (`w`) mode will erase all the contents of the file and write new content
[13:54] f = open('ircnicks.txt', 'w')
[13:54] f.write('palnabarun\n')
[13:54] f.write('ann\n')
[13:54] f.write('prokbird\n')
[13:55] f.close()
[13:55] the `with` construct works here as well
[13:56] Now that you know how to read and write a file, let's implement a `copy.py` script which copies the contents of a source file to a destination file
[13:57] Usage: `copy.py file1 file2` should copy file1 contents to file2
[14:01] https://paste.fedoraproject.org/paste/jSKvk86WVmCssPcfTyIR-g
[14:02] https://paste.fedoraproject.org/paste/J-fvS9cTMo3oqJN-Ib7qfA
[14:03] palnabarun: what if your computer's memory is 256MB and the source file is 5 GB?
[14:03] rtnpro, that's what I am improving right now.
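One possible answer to the 256MB versus 5 GB question is to copy in fixed-size chunks inside a `with` block. This is a sketch under assumptions, not any participant's pasted solution: the `copy_file` name, the 64 KiB default chunk size, and the temporary file paths are all hypothetical.

```python
import os
import tempfile


def copy_file(src, dst, chunk_size=64 * 1024):
    """Copy src to dst, holding at most chunk_size bytes in memory."""
    # Binary mode copies the bytes unchanged, whatever the file contains
    with open(src, 'rb') as fin, open(dst, 'wb') as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            fout.write(chunk)
    # Both files have been closed by the with statement at this point


# Hypothetical source and destination, created only for illustration
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, 'file1')
dst_path = os.path.join(workdir, 'file2')
with open(src_path, 'w') as f:
    f.write('palnabarun\nann\nprokbird\n')

copy_file(src_path, dst_path, chunk_size=8)

with open(dst_path) as f:
    copied = f.read()
```

Because the `with` statement closes both files on exit, there is no separate `close()` call to forget.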
[14:03] :)
[14:03] ann: you forgot to close the files
[14:04] https://paste.fedoraproject.org/paste/0dNnY~J-bSl0Yg-xIbnAew
[14:06] https://paste.fedoraproject.org/paste/b~kop2WUAH9qlA3tt0Fpow
[14:06] Is it compulsory to use import sys?
[14:07] prokbird, Only if you want to get the arguments passed while running the script.
[14:07] prokbird, Or for other purposes like exiting the program.
[14:08] palnabarun, what if we mention the file name in the program itself rather than passing it?
[14:09] ann: the file closures should always be executed when you open a file. They should be at the same level as the `open` calls, outside the `if` block
[14:09] prokbird, Not needed, but then you can't specify the filenames as arguments.
[14:09] palnabarun, thanks :)
[14:10] oh yes, ok
[14:10] prokbird, :)
[14:10] prokbird: in real life we don't hard code things in the application, we try to make it generic. We pass the values either as sys args or command line args, or config file
[14:11] How can we know which module to use?
[14:12] It depends on what you are trying to achieve
[14:12] I mean, is there any list from which we can know?
[14:12] !
[14:13] A sys.argv-based implementation is quite basic, and an implementation with it won't look like a cool CLI command
[14:14] There's the optparse library in Python which is quite useful in making real world CLI applications, https://docs.python.org/3.7/library/optparse.html
[14:14] next
[14:14] how do we use config files? Can you show one example?
[14:15] !
[14:16] ann: this is out of scope for today's session :)
[14:17] next
[14:17] !
[14:17] You suggested using optparse. But the doc page says it's deprecated and suggests argparse. So, is there any reason to use it even now?
[14:19] Yes, use argparse
[14:19] rtnpro, okay. thanks
[14:19] Let's end today's class
[14:19] You can try playing with file operations more
[14:20] Feel free to get back if you have any doubts or questions
[14:20] .endclass
----END CLASS----
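Following the closing advice to use argparse, the `copy.py file1 file2` usage could be parsed roughly as follows. This is a sketch only: the `build_parser` name and the `source` and `destination` argument names are assumptions chosen for illustration.

```python
import argparse


def build_parser():
    """Describe the two positional arguments of a copy.py-style script."""
    parser = argparse.ArgumentParser(
        prog='copy.py',
        description='Copy the contents of one file to another.',
    )
    parser.add_argument('source', help='file to read from')
    parser.add_argument('destination', help='file to write to')
    return parser


# An explicit list is parsed here so the sketch runs anywhere; a real
# script would call parse_args() with no arguments to read sys.argv.
args = build_parser().parse_args(['file1', 'file2'])
```

Compared with reading `sys.argv` directly, argparse also generates a `--help` message and reports missing arguments.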