Python Automation

In this (beginner-friendly) article, you will learn how you can make your own Python 🐍 script for organizing files for your own workflow just the way I did.

What problem did I fix? 🤔


I code a lot of Python 🐍 snippets, surf videos, and work on projects which require me to download a lot of Images, Videos, and whatnot!

I don't get time to organize my downloads 📂 and if I want to look for a file...

I can't find the file! (screams angrily)

So, I wrote a Python script to automatically organize my files in one go. ✨✨

After writing the script, it's a lot easier for me to look for anything, cause every file is organized and sits in its own 📂, which saves me a lot of time now.

My Mental Health is So Much Better now!

The IDEA 🔥

A Script which Moves files to their respective 📂✨

For implementing that, here's what I did (abstract view):

  • Created a dictionary 📔 which contains the file types (extensions) and paired them with their respective 📂 names

  • Took input (path) from the User via the command-line argument 💻

  • Looked for files in that path and made a mechanism (function) for getting their respective 📂 names.

  • Moved 🖼, 📽, 🎵, 🗒️ etc. to their 📂📂📂

See, it's simple!

Steps for Writing our (Magical ✨) Script

Step #1: Creating a dictionary 📔 of Extensions (suffix) and Directories 📂

This part is the most time-consuming (and the easiest) 😁. We just need to create a dictionary 📔 (hash map) where the keys are file extensions and the values are the names of the directories we want to move those files into.

Note: You can relate Dictionaries in Python 🐍 to Objects in JavaScript.

So, Every multimedia file has an extension:

  • Image files have extensions .jpeg, .png, .jpg, .gif...

  • Video files have extensions .mp4, .mov, .mkv, .webm...

  • Document files have extensions .doc, .docx, .csv, .pdf...

And so on...

So, we can Google the extensions of Images, Videos, Documents, Program Files, and whatever kind of files we need to organize, and store them as {"extension": "directory"} pairs in a dictionary named dirs like this...

dirs = {
    # Images
    "jpeg": "Images",
    "png": "Images",
    "jpg": "Images",
    "tiff": "Images",
    "gif": "Images",

    # Videos
    "mp4": "Videos",
    "mkv": "Videos",
    "mov": "Videos",
    "webm": "Videos",
    "flv": "Videos",

    # Music
    "mp3": "Music",
    "ogg": "Music",
    "wav": "Music",
    "flac": "Music",

    # Program Files
    "py": "Program Files",
    "js": "Program Files",
    "cpp": "Program Files",
    "html": "Program Files",
    "css": "Program Files",
    "c": "Program Files",
    "sh": "Program Files",

    # Documents
    "pdf": "Documents",
    "doc": "Documents",
    "docx": "Documents",
    "txt": "Documents",
    "ppt": "Documents",
    "ods": "Documents",
    "csv": "Documents",
}
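
Typing one line per extension can get repetitive. Here's a minimal sketch of another way to build the same kind of dictionary. The categories name and the shortened lists are assumptions made for illustration, not part of the original script:

```python
# Hypothetical helper mapping: one entry per directory instead of one per extension.
categories = {
    "Images": ["jpeg", "png", "jpg", "tiff", "gif"],
    "Videos": ["mp4", "mkv", "mov", "webm", "flv"],
    "Music": ["mp3", "ogg", "wav", "flac"],
}

# Flip it into the {"extension": "directory"} shape used in this article.
dirs = {
    extension: directory
    for directory, extensions in categories.items()
    for extension in extensions
}

print(dirs["png"])   # Images
print(dirs["flac"])  # Music
```

Both shapes hold the same information, so which one to type out is mostly a matter of taste.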
                

Step #2: Taking input (path) from the user 🧑🏻‍💻👩🏻‍💻 via a command-line argument 💻

What are command-line arguments?

Here's a simplified version for those who don't know what a command-line argument is:

When you run the cd command for changing directory and give it a path like this...

cd Documents/TOP_SECRET_STUFF
                

OR

ls Downloads
                

that path is called a command-line argument.

Basically, command-line arguments are values that are passed onto the programs or scripts being executed.

Now, getting back to our script 📃👇🏻

Python has a module named sys which has an attribute named argv: a list of the command-line arguments given while executing the script.

Here's how it works...

When you run a python script like this...

python helloWorld.py
                

argv is a list that contains all the command-line arguments given.

In the above example, the first (and only) command-line argument is "helloWorld.py", which is the name of the script file.

So argv is ['helloWorld.py'].

If we give command-line arguments like this...

python add.py 2 3
                

argv will be ['add.py', '2', '3']. Note that every element is a string, even the numbers.
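
To make that concrete, here's a small sketch of what a script like add.py could do with its arguments. The add_from_args function is a hypothetical helper, and the list is written out by hand so the example runs anywhere; in a real script you would pass sys.argv instead:

```python
def add_from_args(args):
    """Sum the numbers in an argv-style list, skipping the script name."""
    # args[0] is the script name; everything after it arrives as a string,
    # so each value has to be converted before adding.
    return sum(int(value) for value in args[1:])


# Same list that `python add.py 2 3` would put in sys.argv.
print(add_from_args(["add.py", "2", "3"]))  # 5
```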

Let's use it for taking a path from the user in our script file.

from sys import argv

# argv = ['our_script_file.py', 'path/to/dir']

# Select second element (at index 1) as path
path = argv[1]  # 'path/to/dir'
                

There's a tiny bug in it.

If the user runs the script without providing the path, then our script will throw an IndexError (cause we are trying to access the 2nd element but there won't be any 2nd element to access).

Though, we can easily fix that bug by adding a condition before accessing the path from argv.

That simple condition is to check if the command-line arguments are exactly 2 (name of the script file and the path).

If this isn't the case then display the Error message along with Usage like this...

from pathlib import Path
from sys import argv, exit

if len(argv) != 2:
    print("=" * 35)
    print("[ERROR] Invalid number of arguments were given")
    print(f"[Usage] python {Path(__file__).name} <dir_path>")
    print("=" * 35)
    exit(1)  # terminate the execution of the script with exit code 1

# Path to Organize
PATH = argv[1]
                

And... It's fixed! 😁😁😁

Note:

  • __file__ contains the path of the script file being executed (absolute since Python 3.9; on older versions it can be relative)

  • Path(__file__) returns a Path object (the next section digs more into it), and with this object, we can simply print the name of the script file by using its name attribute: Path(__file__).name
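
Here's a quick sketch of what the name attribute gives back. The script location is made up for illustration, and PurePosixPath is used only so the example prints the same thing on every operating system; Path objects expose the same attributes:

```python
from pathlib import PurePosixPath

# Hypothetical location of the script, used only for illustration.
script_path = PurePosixPath("/home/user/projects/organize.py")

print(script_path.name)    # organize.py
print(script_path.parent)  # /home/user/projects
```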

Now we know how we can take input from the command-line!! 🔥🔥🔥

Step #3: Scan the path for files and move them into their folders

Python has an AWESOME 😎 module called pathlib which provides Classes for working with paths.

It's so cool that it can handle paths from different Operating Systems! 🤯

For organizing files, we need to...

  • 🔍 Look for files in the path 🛣
  • Check each file's extension 🧐 (file type)
  • Select a destination 🗺️ based on that extension
  • Move ✂️ files to their destination directory

🔍 Looking for files in the path 🛣

To do that, we gonna convert the path (provided via the command-line argument) into a Path object with the help of the Path class.

First, we import the Path class like this...

from pathlib import Path
                

and then pass the path stored at index 1 of argv (the 2nd element) to the Path class

from pathlib import Path
from sys import argv

PATH = Path(argv[1])  # argv[1] = 'path/to/organize'
                

For looking up the files in that directory, we gonna iterate it with the help of the iterdir method.

from pathlib import Path
from sys import argv

PATH = Path(argv[1])  # 'path/to/organize'

for filename in PATH.iterdir():
    pass  # Do Magic (Organize)
                
                

This is gonna help us iterate files and folders in any path! 😃

Cause we want our script to organize only files, we need to check if the filename is a file or a directory. We can easily do that with the is_file method.

from pathlib import Path
from sys import argv

PATH = Path(argv[1])  # argv[1] = 'path/to/organize'

for filename in PATH.iterdir():
    if filename.is_file():  # If filename is a file
        pass  # Do Magic (Organize)
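
If you'd like to see iterdir and is_file in action without touching your real folders, here's a minimal sketch that works inside a throwaway temporary directory. The file and folder names are made up for illustration:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").touch()   # two empty files...
    (root / "photo.png").touch()
    (root / "Images").mkdir()      # ...and one directory

    # iterdir() yields files and directories alike; is_file() filters out the latter.
    files_only = sorted(entry.name for entry in root.iterdir() if entry.is_file())

print(files_only)  # ['notes.txt', 'photo.png']
```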
                
                

Checking each file's extension 🧐 (file type)

Cause we are working with Path objects, it makes it pretty simple for us to get the extension of a file.

Let's say filename is a Path object. Its extension is stored in the suffix attribute of that Path object.


# filename = Path object with path value "AWESOME.jpeg"
extension = filename.suffix  # '.jpeg'
                

Cause we need that extension without the leading '.', we gonna remove it with string slicing.


# filename = Path object with path value "AWESOME.jpeg"
extension = filename.suffix[1:]  # 'jpeg'
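
A few edge cases are worth knowing about before relying on suffix. This sketch (the file names are made up) shows that the case of the extension is kept as-is, that only the last extension counts, and that a file without an extension gives an empty string:

```python
from pathlib import PurePosixPath

names = ["AWESOME.jpeg", "PHOTO.JPG", "backup.tar.gz", "README"]

# Same slicing trick as above, applied to each name.
extensions = {name: PurePosixPath(name).suffix[1:] for name in names}

print(extensions)
# {'AWESOME.jpeg': 'jpeg', 'PHOTO.JPG': 'JPG', 'backup.tar.gz': 'gz', 'README': ''}
```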
                

Now, we know how to get extension names! 😄🎉

Selecting a destination 🗺️ based on that extension

The heading says it all.

For selecting a destination, we will use the dictionary we created earlier (long key: value pair thingy) and use it for getting the right destination name for every single file.


# filename = Path object with path value "AWESOME.jpeg"
extension = filename.suffix[1:]  # 'jpeg'
directory_name = dirs.get(extension)  # 'Images'
                

Cool 😎! But what if the extension is not in our dictionary? 😐 It's gonna return None and we don't want None. Do we?

So, let's add a default value "Miscellaneous" as a directory name for files that are not in our dictionary.


# filename = Path object with path value "New_IMG.ppm"
extension = filename.suffix[1:]  # 'ppm'
directory_name = dirs.get(extension, 'Miscellaneous')  # 'Miscellaneous'
                

We are almost done!

Cause we gonna use this code again and again, we can wrap it in a function.

Gonna name it get_dir cause it's gonna return directory name for every file it receives as input. Makes sense. Right!?

Also, adding a little docstring never hurts...


def get_dir(filename):
    """
    Take a Path object and return the name of the directory
    its file type belongs in.
    Return "Miscellaneous" if the extension is not in dirs.
    """
    ext = filename.suffix[1:]
    return dirs.get(ext, "Miscellaneous")
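
One thing to keep in mind: the keys in dirs are lowercase, so a file named HOLIDAY.JPG would land in Miscellaneous. Here's a sketch of a possible variant that lowercases the extension first. The get_dir_ignoring_case name is hypothetical and the dictionary is a shortened copy just for this example:

```python
from pathlib import PurePosixPath

# Shortened copy of the dictionary from Step #1, just for this example.
dirs = {"jpeg": "Images", "jpg": "Images", "mp4": "Videos"}


def get_dir_ignoring_case(filename):
    """Return the destination directory name, treating 'JPG' like 'jpg'."""
    ext = filename.suffix[1:].lower()
    return dirs.get(ext, "Miscellaneous")


print(get_dir_ignoring_case(PurePosixPath("HOLIDAY.JPG")))  # Images
print(get_dir_ignoring_case(PurePosixPath("New_IMG.ppm")))  # Miscellaneous
```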
                

Move ✂️ files to their destination directory (finally! 😃)

We gonna use the move function from the shutil (shell utilities) module (Yes, another dope module in Python) for moving files from one place to another.

Moving files is super easy! Here's an example.


from shutil import move

source_path = '/home/Downloads/Awesome.jpeg'
destination_path = '/home/Downloads/Images'

move(source_path, destination_path)  # self explanatory
                

It's similar to executing this command in Bash

mv ~/Downloads/Awesome.jpeg ~/Downloads/Images
                

To get the path out of a Path object as a string, we use the str function.

We gotta use it cause the move function takes in paths as strings (older Python versions could fail when handed a Path object as the source).

move(
    str(source_path),
    str(destination_path)
)
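
Here's a safe way to try move out, sketched inside a temporary directory so nothing in your real Downloads folder is touched. The file names mirror the example above but everything is created on the fly:

```python
import tempfile
from pathlib import Path
from shutil import move

with tempfile.TemporaryDirectory() as tmp:
    downloads = Path(tmp)
    source_path = downloads / "Awesome.jpeg"
    destination_path = downloads / "Images"

    source_path.touch()       # create an empty stand-in file
    destination_path.mkdir()  # the destination directory has to exist first

    # Because the destination is an existing directory, the file is moved *into* it.
    move(str(source_path), str(destination_path))

    moved = (destination_path / "Awesome.jpeg").exists()
    still_at_source = source_path.exists()

print(moved, still_at_source)  # True False
```

If the destination directory did not exist, move would rename the file to "Images" instead, which is why the directory gets created first.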
                

Create a destination directory for files to be moved in

We gonna use the get_dir function we defined earlier for creating a destination directory for each file by doing the following:

destination = PATH / get_dir(filename)

if not destination.exists():
    destination.mkdir()
                

We made a new Path object by joining the path we gave via the command-line with our selected directory name using PATH / get_dir(filename), and stored it in the destination variable.

Also, we need to create that directory if it doesn't exist, which is done with the exists and mkdir methods.
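
As a side note, mkdir also accepts an exist_ok argument that folds the exists check into the same call. This is a sketch of that alternative, not what the script in this article uses, and it runs in a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    destination = Path(tmp) / "Images"

    # exist_ok=True makes mkdir() do nothing when the directory is already there,
    # so calling it twice does not raise FileExistsError.
    destination.mkdir(exist_ok=True)
    destination.mkdir(exist_ok=True)

    created = destination.is_dir()

print(created)  # True
```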

Note:

  • "~" is shell shorthand for the path to the home directory (the shell expands it before our script sees it)
  • We need an absolute path for moving files (Absolute Path refers to the full path to a file/directory)
    • For getting an absolute path, we use the absolute method of the Path object.
    • filename.absolute() returns a Path object with the absolute source path of the file
  • pathlib (kinda) supports moving files too! But we used shutil cause it mimics the behavior of the mv command and does its job without any issues.
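
Here's a small sketch of both ideas from the notes. expanduser is not used in the script in this article; it's shown as an assumption about how you could handle a "~" that reaches Python unexpanded (for example, when the argument is quoted):

```python
from pathlib import Path

# expanduser() replaces a leading "~" with the home directory,
# the same job the shell usually does for us.
downloads = Path("~/Downloads").expanduser()
print(downloads == Path.home() / "Downloads")  # True

# absolute() turns a relative path into a full one based on the current directory.
script = Path("organize.py").absolute()
print(script.is_absolute())  # True
```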

Now, we know everything we need for implementing the (Magical ✨) Script! 😎🔥

🔥🔥🔥🔥

LET'S PUT TOGETHER EVERYTHING

WE HAVE LEARNED/DONE SO FAR

🔥🔥🔥🔥

from pathlib import Path
from sys import argv, exit
from shutil import move


def get_dir(filename):
    """
    Take a Path object and return the name of the directory
    its file type belongs in.
    Return "Miscellaneous" if the extension is not in dirs.
    """
    ext = filename.suffix[1:]
    return dirs.get(ext, "Miscellaneous")


dirs = {
    # Images
    "jpeg": "Images",
    "png": "Images",
    "jpg": "Images",
    "tiff": "Images",
    "gif": "Images",

    # Videos
    "mp4": "Videos",
    "mkv": "Videos",
    "mov": "Videos",
    "webm": "Videos",
    "flv": "Videos",

    # Music
    "mp3": "Music",
    "ogg": "Music",
    "wav": "Music",
    "flac": "Music",

    # Program Files
    "py": "Program Files",
    "js": "Program Files",
    "cpp": "Program Files",
    "html": "Program Files",
    "css": "Program Files",
    "c": "Program Files",
    "sh": "Program Files",

    # Documents
    "pdf": "Documents",
    "doc": "Documents",
    "docx": "Documents",
    "txt": "Documents",
    "ppt": "Documents",
    "ods": "Documents",
    "csv": "Documents",
}

# Make sure exactly one path was given along with the script name
if len(argv) != 2:
    print("=" * 35)
    print("[ERROR] Invalid number of arguments were given")
    print(f"[Usage] python {Path(__file__).name} <dir_path>")
    print("=" * 35)
    exit(1)

# Directory Path
PATH = Path(argv[1])

for filename in PATH.iterdir():

    path_to_file = filename.absolute()

    # Only files get organized; directories are left alone
    if path_to_file.is_file():
        destination = PATH / get_dir(filename)

        if not destination.exists():
            destination.mkdir()

        move(str(path_to_file), str(destination))
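
One situation the script above doesn't cover: when the destination directory already holds a file with the same name, move raises shutil.Error instead of overwriting it. Here's a sketch of one possible way to handle that by skipping such files. The organize function, its skip behaviour, the lowercased extension, and the shortened dictionary are all assumptions for illustration, and the demo runs in a temporary directory:

```python
import tempfile
from pathlib import Path
from shutil import move

# Shortened copy of the dictionary, just for this example.
dirs = {"png": "Images", "mp4": "Videos", "pdf": "Documents"}


def organize(root):
    """Move every file directly inside root into its category directory.

    Returns the names of files that were skipped because a file with the
    same name already exists in the destination.
    """
    skipped = []
    # list() takes a snapshot first, so directories created below are not visited.
    for entry in list(root.iterdir()):
        if not entry.is_file():
            continue
        destination = root / dirs.get(entry.suffix[1:].lower(), "Miscellaneous")
        destination.mkdir(exist_ok=True)
        if (destination / entry.name).exists():
            skipped.append(entry.name)  # leave the file where it is
            continue
        move(str(entry), str(destination))
    return skipped


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["photo.png", "clip.mp4", "notes.xyz", "report.pdf"]:
        (root / name).touch()
    # A file with the same name is already in place, to trigger the skip branch.
    (root / "Documents").mkdir()
    (root / "Documents" / "report.pdf").touch()

    skipped = organize(root)
    result = sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())

print(skipped)  # ['report.pdf']
print(result)
# ['Documents/report.pdf', 'Images/photo.png', 'Miscellaneous/notes.xyz',
#  'Videos/clip.mp4', 'report.pdf']
```

Skipping is only one option; renaming the incoming file would work too, depending on how you like your folders.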
                

Finally, Running the (magical) script 🔥

For organizing my downloads folder, I can now simply run the following command...

python organize.py ~/Downloads
                

Making this long command shorter with alias

I created an alias that executes the above command so I don't have to write python ~/projects/File-Organizer/organize.py for executing the script from any directory.

Here's how I now organize files in a path

organize ~/Downloads
                

SO CLEAN! 😁