Before discussing file IO and json, let’s talk about accepting input from the user using input(). Example:

name = input("What's your name?\n")
age = int(input("Age?\n"))
print("Name : {}  Age : {}".format(name,age))
#output
What's your name?
plog
Age?
42
Name : plog  Age : 42
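Note that int() raises a ValueError if the user types something that is not a whole number. Here is a minimal sketch of guarding against that. parse_age is a hypothetical helper name, not part of the example above:

```python
def parse_age(text):
    """Return the age as an int, or None if text is not a whole number."""
    try:
        return int(text.strip())
    except ValueError:
        return None


# In a real script the text would come from input("Age?\n").
print(parse_age("42"))         # 42
print(parse_age("forty-two"))  # None
```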

Reading and writing files involves the following three steps:

  • Open the file using open(filename, mode)
  • Read from or write to the file using read() or write()
  • Close the file using close()

A file can be opened in the following modes:

  • r for read only. This is the default mode if mode is not specified
  • w for write only. This mode overwrites any existing content
  • r+ for read and write
  • a for appending content to a file
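To see how the modes differ, here is a small sketch that runs all four against one scratch file. The file name modes.txt and the temporary directory are assumptions made for illustration, and the with statement it uses is explained later in this post:

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes.txt")

    # 'w' creates the file, or truncates it if it already exists.
    with open(path, "w") as f:
        f.write("one\n")

    # 'a' keeps the existing content and writes at the end.
    with open(path, "a") as f:
        f.write("two\n")

    # 'r+' reads and writes without truncating; the file must already exist.
    with open(path, "r+") as f:
        before = f.read()  # the position is now at the end of the file
        f.write("three\n")

    # 'r' (the default) is read only.
    with open(path) as f:
        after = f.read()

print(repr(before))  # 'one\ntwo\n'
print(repr(after))   # 'one\ntwo\nthree\n'
```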

Example:

def print_file(filename):
    """Prints the contents of a file"""
    f = open(filename)
    print(f.read())
    f.close()

filename = "temp.txt"

#write to a file
temp_file = open(filename,'w')
temp_file.write('First line.\nSecond line.\n')
temp_file.close()

print_file(filename)

#append to file
temp_file = open('temp.txt','a')
temp_file.write('Third line')
temp_file.close()

print_file(filename)
#output
First line.
Second line.

First line.
Second line.
Third line

A context manager lets us open a file with the guarantee that the file will be closed even if an exception is raised. Example:

with open('workfile', 'r') as f:
    read_data = f.read()
print(f.closed)
#output
True
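The example above never raises an exception, so here is a sketch that does. It writes a scratch file in a temporary directory (an assumption made so the snippet runs anywhere), fails on purpose while reading, and then checks that the file was still closed:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "workfile")
    with open(path, "w") as f:
        f.write("some data\n")

    try:
        with open(path) as f:
            first_line = f.readline()
            raise RuntimeError("something went wrong mid-read")
    except RuntimeError as err:
        message = str(err)

    # The with block closed the file before the exception reached us.
    closed_after_error = f.closed

print(closed_after_error)  # True
```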

Some common exceptions to handle when dealing with files:

  • FileNotFoundError
  • FileExistsError
  • PermissionError
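A sketch of how these exceptions might be handled. The helper names read_text and create_new are hypothetical, and the 'x' mode (exclusive creation, which fails if the file already exists) is an addition that is not in the list of modes above:

```python
import os
import tempfile


def read_text(path):
    """Return the file's contents, or None if it can't be read."""
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        print("No such file:", path)
    except PermissionError:
        print("Not allowed to read:", path)
    return None


def create_new(path, text):
    """Create path with text; return False if the file already exists."""
    try:
        # 'x' is exclusive creation: it fails if the file already exists.
        with open(path, "x") as f:
            f.write(text)
        return True
    except FileExistsError:
        print("Refusing to overwrite:", path)
        return False


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")
    missing = read_text(path)             # file not created yet
    first = create_new(path, "hello\n")   # creates the file
    second = create_new(path, "again\n")  # the file exists now
    content = read_text(path)
```

PermissionError is hard to trigger portably, so the sketch only shows where its handler would go.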

The os module is used to interact with the operating system.

The function os.path.join() builds a path string from its parts, using the separator of the OS it runs on. For example, Windows uses a backslash in paths, while *nix OSes use a forward slash.

import os

print(os.path.join('home','plog','Desktop'))
#output
home/plog/Desktop
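To see both separators without switching machines, the standard library exposes the two implementations behind os.path as posixpath and ntpath. A small sketch:

```python
import ntpath     # the Windows implementation of os.path
import posixpath  # the *nix implementation of os.path

parts = ("home", "plog", "Desktop")

print(posixpath.join(*parts))  # home/plog/Desktop
print(ntpath.join(*parts))     # home\plog\Desktop
```

In normal code, plain os.path.join() is the one to use, since it picks the right implementation automatically.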

os.getcwd() is used to retrieve the current working directory.

print(os.getcwd())
#output
/home/plog/Desktop/python_learn

os.chdir(path) is used to change directory.

os.chdir('modules')
print(os.getcwd())  # /home/plog/Desktop/python_learn/modules
os.chdir('../')
print(os.getcwd())  # /home/plog/Desktop/python_learn

Paths can be relative or absolute. In a relative path, . refers to the current directory and .. refers to its parent directory.

os.mkdir(name) is used to create a directory. It raises FileExistsError if the directory already exists, so os.path.exists(path) can be used first to decide whether the directory needs to be created.

if not os.path.exists('./files_demo'):
    os.mkdir('./files_demo')
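An alternative worth knowing is os.makedirs() with exist_ok=True, which creates missing parent folders too and does nothing if the folder is already there. A sketch, where the archive subfolder and the temporary directory are assumptions made for illustration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    nested = os.path.join(tmp_dir, "files_demo", "archive")

    # Creates files_demo and archive in one call.
    os.makedirs(nested, exist_ok=True)
    # Calling it again is harmless because of exist_ok=True.
    os.makedirs(nested, exist_ok=True)

    created = os.path.isdir(nested)

print(created)  # True
```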

os.path.getsize(path) is used to find the size of the file in the specified path.

os.listdir(path) lists the names of the files and folders directly inside the specified path. It does not descend into subfolders.

os.path.isdir(path) and os.path.isfile(path) check whether the specified path is a directory or a file respectively. Each returns True or False.

print(os.path.getsize('./temp.txt'))
print(os.listdir('./'))

print(os.path.isdir('./files_demo'))
print(os.path.isfile('./temp.txt'))
#output
35
['.git', 'requirements.txt', 'io.py', 'files_demo', 'strings.py', 'range.py', 'lists.py', 'LICENSE',
'datatypes.py', 'tuples.py', 'control_flow.py', 'hello.py', 'sets.py', 'dictionaries.py', 'functions.py',
'.gitignore', 'module_demo.py', 'classes.py', 'temp.txt', 'README.md', 'modules']
True
True
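These functions are often combined. Here is a sketch that splits the entries of a folder into folders and files. split_entries is a hypothetical helper, and the demo runs in a temporary directory so the result is predictable:

```python
import os
import tempfile


def split_entries(path):
    """Return (folders, files) found directly inside path, each sorted."""
    folders, files = [], []
    for name in os.listdir(path):
        # listdir returns bare names, so join them back onto path.
        full_path = os.path.join(path, name)
        if os.path.isdir(full_path):
            folders.append(name)
        elif os.path.isfile(full_path):
            files.append(name)
    return sorted(folders), sorted(files)


with tempfile.TemporaryDirectory() as tmp_dir:
    os.mkdir(os.path.join(tmp_dir, "files_demo"))
    with open(os.path.join(tmp_dir, "temp.txt"), "w") as f:
        f.write("First line.\n")

    folders, files = split_entries(tmp_dir)

print(folders)  # ['files_demo']
print(files)    # ['temp.txt']
```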

The shelve module is used to save variables to a file and retrieve them. It stores the contents in binary format.

import shelve

shelf = shelve.open('shelve_test')
langs = ['python', 'java', 'php']
shelf['langs'] = langs
shelf.close()

shelf = shelve.open('shelve_test')
print(shelf['langs'])
shelf.close()
#output
['python', 'java', 'php']
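A shelf can also be used as a context manager, so it gets closed automatically. A sketch, with the shelf stored in a temporary directory and the version key made up for illustration:

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    shelf_path = os.path.join(tmp_dir, "shelve_test")

    # Write two values; the shelf is closed when the block ends.
    with shelve.open(shelf_path) as shelf:
        shelf["langs"] = ["python", "java", "php"]
        shelf["version"] = 3

    # Reopen the shelf and read the values back.
    with shelve.open(shelf_path) as shelf:
        keys = sorted(shelf.keys())
        langs = shelf["langs"]

print(keys)   # ['langs', 'version']
print(langs)  # ['python', 'java', 'php']
```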

shutil is a module that lets us move, copy, rename and delete files and folders.

import shutil

#copy to directory
shutil.copy('./temp.txt','./files_demo')
#copy an entire folder recursively
shutil.copytree('./modules','./modules_backup')
#move a file
shutil.move('./temp.txt','./files_demo/temp_backup.txt')

if not os.path.exists('./files_demo_2'):
    os.mkdir('./files_demo_2')

#move a folder
shutil.move('./files_demo_2','files_demo')
shutil.move('./shelve_test','./files_demo/files_demo_2')
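Both shutil.copy() and shutil.move() return the path of the new file, which is handy for checking where things ended up. A self-contained sketch that reuses the file names from above inside a temporary directory (the temporary directory is an assumption so the snippet can run anywhere):

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    source = os.path.join(tmp_dir, "temp.txt")
    backup_dir = os.path.join(tmp_dir, "files_demo")
    os.mkdir(backup_dir)
    with open(source, "w") as f:
        f.write("First line.\n")

    # Destination is a directory, so the file keeps its name.
    copied_to = shutil.copy(source, backup_dir)

    # Destination is a full file path, so this moves and renames in one step.
    moved_to = shutil.move(source, os.path.join(backup_dir, "temp_backup.txt"))

    source_still_exists = os.path.exists(source)
    backup_contents = sorted(os.listdir(backup_dir))

print(os.path.basename(copied_to))  # temp.txt
print(os.path.basename(moved_to))   # temp_backup.txt
print(source_still_exists)          # False
print(backup_contents)              # ['temp.txt', 'temp_backup.txt']
```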

Deletion of files and folders can be done using the os and shutil modules:

  • os.unlink(path) deletes a file
  • os.rmdir(path) deletes an empty folder
  • shutil.rmtree(path) will delete a directory along with all files and subfolders
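A sketch of the three functions side by side. Everything happens inside a temporary directory, and the file and folder names are made up for illustration:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "temp.txt")
    empty_dir = os.path.join(tmp_dir, "empty")
    full_dir = os.path.join(tmp_dir, "full")

    with open(file_path, "w") as f:
        f.write("delete me\n")
    os.mkdir(empty_dir)
    os.makedirs(os.path.join(full_dir, "nested"))

    os.unlink(file_path)  # removes a single file
    os.rmdir(empty_dir)   # works only because the folder is empty

    try:
        os.rmdir(full_dir)  # fails: the folder still contains "nested"
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

    shutil.rmtree(full_dir)  # removes the folder and everything in it

    remaining = os.listdir(tmp_dir)

print(rmdir_failed)  # True
print(remaining)     # []
```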

Since deleting files and folders is risky, the send2trash module can be used in place of the functions above. It moves files and folders to the trash instead of deleting them permanently, so they can be restored if something unexpected happens.

Install the send2trash module using:

pip install send2trash

Make sure to update requirements.txt using:

pip freeze > requirements.txt

The following example shows the usage of send2trash:

import send2trash

send2trash.send2trash('./files_demo')
send2trash.send2trash('./modules_backup')

A commonly used operation is to traverse a folder and all its files and subfolders recursively. This can be accomplished using os.walk(path):

for folderName, subfolders, filenames in os.walk('./'):
    print('The current folder is ' + folderName)

    for subfolder in subfolders:
        print('SUBFOLDER OF ' + folderName + ': ' + subfolder)
    for filename in filenames:
        print('FILE INSIDE ' + folderName + ': '+ filename)

    print('')

The above code prints all the folders, subfolders and files in the current directory. I did not paste the output, since it includes the contents of .git which will unnecessarily clutter this plog.
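One way to keep folders like .git out of the traversal is to remove them from the subfolders list in place, which tells os.walk() not to descend into them. A sketch, where walk_without is a hypothetical helper and the folder layout is created in a temporary directory purely for the demo:

```python
import os
import tempfile


def walk_without(root, skipped=(".git",)):
    """Yield (folder, filenames) pairs under root, skipping some folders."""
    for folder_name, subfolders, filenames in os.walk(root):
        # Editing the list in place tells os.walk not to descend into them.
        subfolders[:] = [name for name in subfolders if name not in skipped]
        yield folder_name, sorted(filenames)


with tempfile.TemporaryDirectory() as tmp_dir:
    os.makedirs(os.path.join(tmp_dir, ".git", "objects"))
    os.makedirs(os.path.join(tmp_dir, "modules"))
    for parts in ((".git", "config"), ("modules", "demo.py"), ("io.py",)):
        with open(os.path.join(tmp_dir, *parts), "w") as f:
            f.write("")

    # Map each visited folder (relative to the root) to its files.
    visited = {
        os.path.relpath(folder, tmp_dir): files
        for folder, files in walk_without(tmp_dir)
    }

print(visited)
```

This only works with the default top-down traversal, because the list has to be edited before os.walk() moves on to the subfolders.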

The json module enables us to convert JSON strings to Python objects (dictionaries, lists and so on) and vice versa.

Example:

import json

json_string = '{"first_name": "Guido", "last_name":"Rossum"}'
parsed_json = json.loads(json_string)
print(parsed_json['first_name'])
print(json.dumps(parsed_json))
#output
Guido
{"first_name": "Guido", "last_name": "Rossum"}

json.load() and json.dump() also exist. They read from and write to file objects instead of strings.
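A sketch of a round trip through a file. The file name person.json, the extra langs key and the temporary directory are assumptions made for illustration:

```python
import json
import os
import tempfile

person = {"first_name": "Guido", "last_name": "Rossum", "langs": ["python"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "person.json")

    # json.dump writes the object to an open file.
    with open(path, "w") as f:
        json.dump(person, f, indent=2)

    # json.load reads it back from an open file.
    with open(path) as f:
        loaded = json.load(f)

print(loaded == person)  # True
```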

shelve is often more convenient than JSON for saving Python objects. The json module handles dictionaries, lists, strings, numbers, booleans and None out of the box, but custom objects need extra encoding and decoding code. shelve pickles its values, so it can store most Python objects directly.
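To illustrate what happens with custom objects, here is a sketch using a made-up Language class. Passing default= to json.dumps() is one possible workaround, not the only one:

```python
import json


class Language:
    def __init__(self, name, year):
        self.name = name
        self.year = year


python = Language("python", 1991)

try:
    json.dumps(python)
    serializable = True
except TypeError:
    # json does not know how to encode a Language instance.
    serializable = False

# One workaround: tell json how to turn unknown objects into a dict.
encoded = json.dumps(python, default=lambda obj: obj.__dict__)
# Decoding gives back a plain dict, not a Language instance.
decoded = json.loads(encoded)

print(serializable)  # False
print(encoded)       # {"name": "python", "year": 1991}
print(type(decoded).__name__)  # dict
```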

Source code for today’s plog is here.
