Today was all about loops and data structures. As it turns out, this is a topic with which I'm pretty comfortable, and one which I've used a bit in Project Euler and Project Rosalind. I did get a little hung up on a set of nested dictionaries, and on aggregating all the numbers stored as values under keys at various levels.
Beyond that, I used my extra work time to build a small debugging tool that displays all the variables, their types, and their values at a given point in a script. The idea came from a moment in lecture about following a variable through a script and keeping track of the type and value associated with each name by using a table:
a = 5
b = a
c = str(a)
name  type  value
a     int   5
b     int   5
c     str   '5'
(Yeah, it's a trivial example, but it's 9:30 pm after a long day and I still have some things to read for tomorrow.)
To automate this process for more complex code, I wrote a series of functions that collect the variables currently in use (using dir(), which supplies the names in use), remove the 'magic' variables, since those clutter the table of variables declared by the user, collect the type and value of each remaining variable, and then display it all as a table printed to the command line. (Credit to Jonathan for introducing the table display and Jeremy for suggesting the dir() function as a means of collecting variable names.)
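For anyone curious what that pipeline might look like, here is a minimal sketch rather than the actual code. The function names (user_names, collect_rows, format_table), the hard-coded MAGIC_NAMES set, and the idea of passing a namespace dictionary such as globals() to look up values are all my assumptions for illustration.

```python
# Hypothetical hard-coded list of 'magic' names to hide from the table.
# The real set of module-level magic names varies between Python versions.
MAGIC_NAMES = {"__builtins__", "__doc__", "__file__", "__name__", "__package__"}


def user_names(names):
    """Return the names that are not in the hard-coded magic list."""
    return [name for name in names if name not in MAGIC_NAMES]


def collect_rows(names, namespace):
    """Build (name, type, value) rows by looking each name up in namespace."""
    rows = []
    for name in user_names(names):
        value = namespace[name]
        rows.append((name, type(value).__name__, repr(value)))
    return rows


def format_table(rows):
    """Lay the rows out as left-aligned, space-padded columns."""
    table = [("name", "type", "value")] + list(rows)
    widths = [max(len(row[i]) for row in table) for i in range(3)]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    )


if __name__ == "__main__":
    a = 5
    b = a
    c = str(a)
    # At module level, dir() lists the same names that globals() holds.
    print(format_table(collect_rows(dir(), globals())))
```

Run as a script, this also lists the helper functions themselves and any magic names missing from the hard-coded set, which is exactly the kind of clutter the filtering step is meant to fight.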
The first version of this was assembled at the end of one of the coding challenges. It was a mess, with global variables called in multiple places. Plus, having this code at the end meant that any error that occurred higher in the script would never be caught by my variable-catcher.
The next iteration cleaned up the global variables into functions... whereupon I discovered that dir() called within a function finds only that function's local variables. Version 0.3 saw all the function definitions moved to the top of the script (so they would be available when called anywhere below). The whole process was now called in a function 'print_table' that took dir() as its only parameter. That way, "print_table(dir())" could be inserted at an arbitrary point and dir() would harvest the names in the scope in which it was called. The list generated could then be processed and displayed.
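The scope behavior is easy to see in a tiny experiment. This is only a sketch: describe is a hypothetical stand-in for print_table that hands back the names it receives, and the other function names are made up for the demonstration.

```python
def describe(names):
    """Hypothetical stand-in for print_table: return the names it was given."""
    return list(names)


def inside_function():
    local_value = 2
    # dir() with no arguments lists the names in the current local scope,
    # so module-level names do not show up here.
    return dir()


def caller():
    first = 1
    second = "two"
    # dir() is evaluated here, in caller's scope, before describe is entered,
    # so describe receives caller's names rather than its own.
    return describe(dir())


print(inside_function())
print(caller())
```

Because the argument is evaluated at the call site, the helper never needs to know which scope it was called from. It only receives names, though, so looking up the matching values still needs some access to that scope.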
Version 0.4 saw the various function definitions moved to an independent file (error_checker.py on GitHub). Now, instead of having to copy my code into a problematic Python script, you can simply copy the error_checker.py file into the same directory and add "from error_checker import *" to your code. Finally, just type "print_table(dir())" at the point where you wish to see the names in use, and the table will be printed to the terminal.
Improvements I'd like to make: output to a CSV file instead of the terminal, dynamically catch the 'magic' variables that should be removed, and a few cleaner implementations of search functions.
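A rough sketch of how the first two improvements might go, using the standard csv module. Nothing here is implemented in error_checker.py: is_magic and write_table_csv are hypothetical names, and treating every name that starts and ends with a double underscore as 'magic' is my assumption about what "dynamically" could mean.

```python
import csv
import io


def is_magic(name):
    """Treat any __dunder__ style name as 'magic' (an assumed convention)."""
    return len(name) > 4 and name.startswith("__") and name.endswith("__")


def write_table_csv(namespace, out_file):
    """Write one name, type, value row per non-magic entry in namespace."""
    writer = csv.writer(out_file)
    writer.writerow(["name", "type", "value"])
    for name in sorted(namespace):
        if is_magic(name):
            continue
        value = namespace[name]
        writer.writerow([name, type(value).__name__, repr(value)])


# Demonstrate with an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_table_csv({"__name__": "__main__", "a": 5, "c": "5"}, buffer)
print(buffer.getvalue())
```

To write to disk instead, the same function should work with a file opened as open("variables.csv", "w", newline=""), where the file name is just an example.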
So, hardly perfect, but I'm pretty happy with having gotten it to work so well, mostly on my own (beyond the dir() suggestion, and some suggestions for making the code cleaner and more compact), and entirely in the time after the main challenge work of the day was completed.