----BEGIN CLASS---- [13:26] #startclass [13:26] Roll call please [13:26] Ashwani Kumar Gupta [13:26] yurii pylypchuk [13:26] Krishnanand Rai [13:26] Sandesh Patel [13:27] Today, we will do a code review of ashwanig repository https://github.com/ashwani99/dgplug-python-exercises [13:28] *ashwanig's [13:28] Everyone please open the repo for reference [13:28] During a code review process, it is important that we focus on the code and not on the person [13:29] As a critic, we should not directly say anything against ashwanig, but, reflect on the code only [13:29] This is very important [13:29] So, let us get to the details [13:30] 1. The README.md tells what this repo is, but, it does not tell us whom to contact. Always, provide a contact in the README file [13:30] 2. It is very nice to have included a LICENSE file [13:31] Depending on the license you use, you should include the license as a header in the code, or can keep it as a separat file in the project sources [13:31] https://github.com/ashwani99/dgplug-python-exercises/blob/master/LICENSE is nicely done! [13:32] 3. The README should also say how to use the project. [13:32] 4. s/python/Python/. For nouns, always start with capital letter [13:32] Some projects will have a separate INSTALL file that gives you instructions on how to install the software for use [13:33] But, these are important to have as part of any project [13:34] 5. The different exercises have been neatly organized into folders. But, after a break or period of time, nobody will know what "Problem 9" or "Problem 6" is [13:34] The repo already says they are exercises. So, the project folders could have been given meaningful names [13:35] For example, instead of "Problem 1" to "1-operation-on-two-numbers" [13:35] 6. The use of whitespaces in file or folder names comes from bad "Windows" experiences. In *nix, avoid filenames with spaces [13:35] In *nix, when I say, file it also means a directory [13:36] ! [13:36] next [13:36] What type of contact information should we include? email? [13:37] ashwanig, good question; e-mail is a MUST [13:37] ashwanig, avoid giving phone numbers; bots can scan these public repos and hence you could consider obfuscating the e-mail address as well [13:37] ashwanig, something like author AT shakthimaan DOT com [13:38] ashwanig, a person using your project or wants to contribute to your project will need to be able to contact you [13:38] mbuf, Ihave seen some repos provide blog links/github profile links as contact information [13:38] they shouldn't have to open the git commit to see the e-mail address [13:38] ashwanig, those are additional and nice to have [13:38] okay [13:38] ashwanig, there is a difference between essentials, nice to have and luxury [13:39] ashwanig, a person who has cloned your repo to work offline will want all the information contained within it [13:39] ashwanig, they shouldn't have to go and look elsewhere to find information; that is the point I am trying to make [13:39] mbuf, understood [13:40] The reason we say that you should avoid filenames with whitespaces is that in the shell environment, you will have to enclose them within double quotes or escape it with \ to use it [13:40] and this is not good. With shell environments and auto-completion, using dash to separate keywords is good [13:41] 7. Some projects keep a CHANGELOG file or release notes so users can quickly refer to that. Since there are no releases made in this project this is fine [13:41] Depending on your project, you may decide to use one [13:41] ! [13:42] Let us open "Problem 1" folder [13:42] next [13:42] Can you show us an example for point number 7 [13:42] ashwanig, https://github.com/emacs-mirror/emacs/blob/master/ChangeLog.1 [13:43] In Problem 1, we see two files, the problem statement and the solution [13:45] 8. You can put the problem statement as comments in the solution itself so that there is a local reference [13:45] Of course here, the problem statement is simple [13:45] Now, open https://github.com/ashwani99/dgplug-python-exercises/blob/master/Problem%201/solution.py [13:46] Because of the whitespace, even the GitHub URL uses %20 for the blank space. If you have plenty of white spaces in the filename, you will have lot of %20 and the URL will not be readable [13:47] These are minute details, but, you need to pay attention to them [13:47] 9. If you decide to put the license in the code itself, you will also put your name, Copyright, year information as a comment header in the top of the file [13:48] ! [13:48] next [13:48] Difference between changelogs & the git log? [13:48] Python enforces indentation and hence, you can see the code is neatly aligned [13:48] vharsh1, the changelog is just a plain text file; it is not dependent on any revision control tool [13:48] Why is it good to have a CHANGELOG when our code is already versioned? [13:49] vharsh1, you need to have Git installed to run "git log" [13:49] vharsh1, Release notes are important for any software product because customers are first going to read it [13:50] vharsh1, a consumer of a software product need not necessarily be a developer; so, having the history of changes tells what has changed in the project [13:50] Ok, but isn't that just expected of all people who are taking a look at our repository? [13:50] Plus, they can see the commits/tags ? [13:50] [13:51] vharsh1, sure; see https://en.wikipedia.org/wiki/Changelog [13:51] vharsh1, for any software product or project that you release you will need to create at least two guides [13:51] vharsh1, one is the User Guide and the other is the Developer Guide [13:51] I mean, commits/tags can be seen on github gitlab, etc. [13:52] vharsh1, the user guide will be useful for users on bug fixes and new features, they might not be interested in the nitty gritty details; the developers will use git log, for example, to see what has changed [13:52] vharsh1, you are talking like a developer; I am addressing the view point as a user as well [13:52] vharsh1, it is viewing the project from a different perspective [13:52] Hmm, I got it :) [13:53] vharsh1, if your implementation is a server, then you need to write an Administratior's guide as well [13:53] vharsh1, if the software implementation is a library, then a Reference Manual needs to be written [13:53] all these are part of the project [13:54] https://en.wikipedia.org/wiki/User_guide [13:54] good questions :) [13:54] coming back to solution.py in Problem 1 [13:55] ! [13:55] can we think of changelogs as summary of important commits? [13:55] oops [13:55] ashwanig, sure [13:56] 9. It line #1, number_1 input is obtained from the user. Whenever any user input is fetched, some validation needs to be done [13:56] This is a trivial example, but, nevertheless, never trust any user input [13:56] The same applies to line #2 [13:57] and line #3. In general, you should have some basic validation to accept or reject the user input, before you do any operation on the input data [13:57] 10. Reading the input could have been written as a separate function which does only one thing that is to get the required input from the user [13:58] 11. It is a good practice to have a mandatory else clause as we see in line #13 [13:59] 12. A good CLI program should also have a help option. Or in line #14, instead of printing "Invalid operation", you can also print a help output to tell the user what input is expected [13:59] If you are writing CLI programs, you can have a separate help option [14:00] 13. In line #15, instead of calling exit(), you can give the user the option to try one more time [14:00] Instead of just exiting the program; just a suggestion [14:01] 14. In line #17, the print output can be made more meaningful. You can specify the result is for which operation [14:01] Or, something like, for the given input x and y and operation z, the result is abc [14:02] Try to be meaningful and give good feedback to the user when writing software; this is again a trivial example, but, the basics are the same [14:02] Let us know take another example; ashwanig do you want to pick a problem? [14:03] mbuf, problem 7 [14:04] ashwanig, thanks [14:04] everyone open https://github.com/ashwani99/dgplug-python-exercises/blob/master/Problem%207/solution.py [14:05] mbuf, better if you do Problem 12 [14:05] it is an extended problem of problem 7 [14:06] ashwanig, we can do that next [14:06] okay [14:06] On problem 7 now, we see that the log string assignment is huge, and we need to scroll all the way down to get to the actual code [14:07] 15. One approach to take here is to put the data in a separate file and load it in runtime [14:08] In line #244, what is the meaning of 3 in len(parts) < 3? [14:08] It is not obvious for me at first look [14:08] mbuf, each line of the log is separated by spaces, those I called parts [14:08] 16. The general tendency for people is to write one single main program [14:09] 16. so it is good to describe the input; maybe, in a comment to show what does line look like [14:09] and then when you do a split what does #243 parts look like, so the user can actually see in the comment how the data is modified by the code [14:10] 17. It is good to separate different functionality into small functions. In the *nix philosophy, each function should do one thing and does it well [14:10] So, the splitting of the line could be a separate function. [14:11] In this example, the output is well formattd and the result is clear [14:12] 18. When you split your logic into functions, you can write commentns and docstrings [14:12] http://docs.python-guide.org/en/latest/writing/documentation/ [14:12] 19. In fact, you can also write test in them, or create a separate unit test file to test your functions [14:13] Right now, there are no tests! It is the developers' responsibility to write unit tests [14:13] 20. Learn to use a continuous integration tool to run all the tests for your project periodically so you know that your project builds fine [14:13] ! [14:13] next [14:13] In which cases doctests are preferred? [14:14] ashwanig, your preference; but, the point is you MUST have unit tests [14:15] GitHub, for example, has integration with CI builds and they can tell you that your builds are fine or not [14:16] See https://github.com/dwyl/repo-badges [14:16] http://www.codeblocq.com/2016/04/Add-a-build-passing-badge-to-your-github-repository/ [14:16] https://travis-ci.org/ [14:17] Otherwise, the logic in the problem 7 is simple [14:17] Let us now move to problem 12 [14:18] Now we see funuctions in place [14:18] *functions [14:18] But, no tests are docstrings yet [14:19] This is much cleaner code compared to the previous exercises [14:20] In line #9, what does return mean? [14:20] 21. It is a good practice to be consistent on what we return on success and failure from a function [14:21] If you use functional programming languages, you will write the return types as part of the function definition [14:21] and it is clear and consistent. So, think about return values when you write functions [14:21] mbuf, file is already present, so returning [14:21] And the function does not return anything [14:22] ashwanig, if it does not exist? [14:22] then a new file will be created [14:22] with open(filename, 'w') as fileobj: [14:22] fileobj.write(content) [14:23] ashwanig, so the question is if the requests.get(url).text fails, what happens? [14:23] 22. When doing a network call, things can fail; so, having a try catch block is helpful in this context [14:24] mbuf, I have not handled that case [14:29] So, those were some important points that we learn from this code; thanks to ashwanig for volunteering to have a code review done [14:29] any questions? [14:29] we will try to do more code reviews, but, not for toy problems [14:30] It will be good to see work that can be reviewed for real world projects [14:31] so, if there are no other questions for now, I shall end the class; roll call please [14:31] yurii pylypchuk [14:31] Ashwani Kumar Gupta [14:31] Krishnanand Rai [14:31] Sandesh Patel [14:32] Bhavin Gandhi [14:32] whatever I said is not justn for Python, but, applies to other programming languages too; we are just discussing good practices [14:32] *just [14:32] Thank you mbuf for this session, it is useful ----END CLASS----