Python - Directory Listing

Vaibhav Pant

2 years ago

Table of Contents
  • Introduction
  • Listing Local Directory
  • Listing Remote Directory
  • Listing Files in a Directory
  • Finding Files – Current Directory Only
  • Recursive Descent – Using os.walk()
  • Recursive File Find
  • Conclusion

Introduction

Python can be used to list the contents of a directory. We can write a program that lists the contents of a directory on the same machine where Python is running. We can also log in to a remote system and list the contents of a remote directory.
To get a list of all the files and folders in a particular directory in the filesystem, use os.listdir() in legacy versions of Python or os.scandir() in Python 3.5 and later. os.scandir() is the preferred method if you also need file and directory attributes such as file size and modification date.
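As a minimal sketch of the os.scandir() approach mentioned above, the following function collects the name, type, size, and modification time of each entry in one directory. The function name describe_entries and the shape of the returned tuples are illustrative assumptions, not part of any library.

```python
import os
from datetime import datetime


def describe_entries(directory="."):
    """Return sorted (name, kind, size_in_bytes, modified) tuples for one directory."""
    rows = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Check for symlinks first: is_dir() and is_file() follow them by default.
            if entry.is_symlink():
                kind = "link"
            elif entry.is_dir():
                kind = "dir"
            elif entry.is_file():
                kind = "file"
            else:
                kind = "unknown"
            # Stat the entry itself rather than the target of a symlink.
            info = entry.stat(follow_symlinks=False)
            modified = datetime.fromtimestamp(info.st_mtime)
            rows.append((entry.name, kind, info.st_size, modified))
    return sorted(rows)
```

Calling describe_entries('.') returns one tuple per entry in the current directory, which can then be printed or filtered as needed.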

Listing Local Directory

In the example below we use the listdir() method to get the contents of the current directory. To also show the type of each entry, such as file or directory, we use additional functions from os.path to check what each entry is.

import os

for name in os.listdir('.'):
    # Test for links first: isfile() and isdir() follow symbolic links.
    if os.path.islink(name):
        print('link:', name)
    elif os.path.isfile(name):
        print('file:', name)
    elif os.path.isdir(name):
        print('dir:', name)
    else:
        print('unknown', name)
When we run the above program, we get the following output −

file: abcl.htm
dir: allbooks
link: ulink
Please note that the output above is specific to the system where the Python program was run. The result will vary depending on the system and its contents.

Listing Remote Directory

We can list the contents of a remote directory by using FTP to access the remote system. Once the connection is established, we can use commands that list the directory contents in a way similar to listing local directories.

from ftplib import FTP
def main():
    ftp = FTP('ftp.ibiblio.org')
    ftp.login()
    ftp.cwd('pub/academic/biology/') # change to some other subject
    entries = ftp.nlst()
    ftp.quit()

    print(len(entries), "entries:")
    for entry in sorted(entries):
        print(entry)

if __name__ == '__main__':
    main()
When we run the above program, we get the following output −

6 entries:
INDEX
README
acedb
dna-mutations
ecology+evolution
molbio
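The remote listing can also be wrapped in a small helper that works on an already-connected FTP object. This is only a sketch: the name list_remote is hypothetical, and the handling of a 550 reply assumes a server that reports an empty directory as a permission error, which some servers do and others do not.

```python
from ftplib import error_perm


def list_remote(ftp, path):
    """Return the sorted entry names in `path` on a connected FTP session."""
    ftp.cwd(path)
    try:
        return sorted(ftp.nlst())
    except error_perm as exc:
        # Assumption: some servers answer with a 550 reply for an empty directory.
        if str(exc).startswith("550"):
            return []
        raise
```

With a session opened as in the program above (for example inside `with FTP('ftp.ibiblio.org') as ftp:` after `ftp.login()`), `list_remote(ftp, 'pub/academic/biology/')` returns the entry names, and the `with` block closes the connection on exit.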
There are several ways to list a directory in Python. In this article, we present a few of these along with the caveats for each.

Listing Files in a Directory

The simplest way to get a list of entries in a directory is to use os.listdir(). Pass in the directory for which you need the entries; use "." for the current directory of the process.
import os

for x in os.listdir('.'):
    print(x)

LICENSE
index.js
API.md
ISSUE_TEMPLATE.md
package.json
test
...
As you can see, the function returns a list of directory entry strings with no indication of whether each one is a file, a directory, etc. Also, the '.' and '..' entries are not returned by the call.
If you need to identify whether an entry is a file, a directory, etc., you can use os.path.isfile() and related functions as shown.
for x in os.listdir('.'):
    # islink() comes first because isfile() and isdir() follow symbolic links
    if os.path.islink(x): print('l-', x)
    elif os.path.isfile(x): print('f-', x)
    elif os.path.isdir(x): print('d-', x)
    else: print('---', x)
f- LICENSE
f- index.js
f- API.md
f- ISSUE_TEMPLATE.md
f- package.json
d- test
f- .gitignore
d- .git
f- rollup.config.js
...
Here is a one-liner using filter() to collect the files:
print(list(filter(os.path.isfile, os.listdir('.'))))

['LICENSE', 'index.js', 'API.md', 'ISSUE_TEMPLATE.md', 'package.json', '.gitignore', ...]
Or the directories:
print(list(filter(os.path.isdir, os.listdir('.'))))

['test', '.git', 'img']
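One caveat with the one-liners above: os.listdir() returns bare names, so os.path.isfile(x) is resolved against the current working directory. That works for '.', but for any other directory the name has to be joined with its parent first. The sketch below shows one way to do this; the function name split_entries is a hypothetical helper.

```python
import os


def split_entries(directory):
    """Return (files, dirs): names of the files and directories inside `directory`."""
    files, dirs = [], []
    for name in sorted(os.listdir(directory)):
        # Join with the parent directory; a bare name would be checked
        # against the current working directory instead.
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            dirs.append(name)
        elif os.path.isfile(full_path):
            files.append(name)
    return files, dirs
```

For example, `files, dirs = split_entries('d3')` separates the entries of a directory named d3 regardless of where the script is run from.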

Finding Files – Current Directory Only

Here is a simple one-liner to find JavaScript files in a directory. Note that this does not descend the directory hierarchy; it only returns the matching entries in the specified directory (see below for a recipe that does descend):
print(list(filter(lambda x: x.endswith('.js'), os.listdir('d3'))))

['index.js', 'rollup.config.js', 'rollup.node.js']
Using list comprehension, the above can also be written as follows:
print([x for x in os.listdir('.') if x.endswith('.js')])


['index.js', 'rollup.config.js', 'rollup.node.js']
Match multiple file extensions using a regular expression.
import re
rx = re.compile(r'\.(js|md)')
print(list(filter(rx.search, os.listdir('.'))))


['index.js', 'API.md', 'ISSUE_TEMPLATE.md', 'package.json', 'rollup.config.js', ...]
Again, using list comprehension the above can be re-written as follows:
print([x for x in os.listdir('.') if rx.search(x)])

['index.js', 'API.md', 'ISSUE_TEMPLATE.md', 'package.json', 'rollup.config.js', 'CHANGES.md', 'README.md', 'rollup.node
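The pattern r'\.(js|md)' is unanchored, which is why 'package.json' appears in the output: it contains '.js' without ending in it. If only true extensions should match, one option is to anchor the pattern with '$', and another is to pass a tuple of suffixes to str.endswith(). The sketch below compares the three on a fixed list of names taken from the sample output, so it does not depend on the contents of any directory.

```python
import re

# Names taken from the sample output above.
names = ['index.js', 'API.md', 'package.json', 'rollup.config.js', 'LICENSE']

# Unanchored: matches '.js' or '.md' anywhere in the name.
loose = re.compile(r'\.(js|md)')
# Anchored: the name must end with '.js' or '.md'.
strict = re.compile(r'\.(js|md)$')

loose_matches = [x for x in names if loose.search(x)]
strict_matches = [x for x in names if strict.search(x)]
# str.endswith() accepts a tuple of suffixes, so no regex is needed.
suffix_matches = [x for x in names if x.endswith(('.js', '.md'))]

print(loose_matches)   # includes 'package.json'
print(strict_matches)  # excludes 'package.json'
print(suffix_matches)  # same result as the anchored pattern
```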

Recursive Descent – Using os.walk()

The function os.walk() provides a way of listing all entries starting from a directory, including entries in its subdirectories. The function returns a generator which, on each iteration, yields a tuple of the current directory name, a list of directories in that directory, and a list of files. Here is an example:
for x in os.walk('.'):
    print(x)


('.', ['test', '.git', 'img'], ['LICENSE', 'index.js', 'API.md', 'ISSUE_TEMPLATE.md', 'package.json', '.gitignore', 'rollup.config.js', 'CHANGES.md', 'd3.sublime-project', 'README.md', 'rollup.node.js', '.npmignore'])
('./test', [], ['test-exports.js', 'd3-test.js'])
('./.git', ['objects', 'logs', 'info', 'refs', 'hooks', 'branches'], ['packed-refs', 'HEAD', 'description', 'config', 'index'])
('./.git/objects', ['pack', 'info'], [])
...
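As the output shows, the walk also descends into directories such as .git, which is often unwanted. When walking top-down (the default), os.walk() visits only the subdirectories that remain in the directory list it yields, so removing names from that list in place prunes the walk. The following is a minimal sketch of that technique; the generator name walk_skipping and its skip parameter are illustrative assumptions.

```python
import os


def walk_skipping(root, skip=('.git',)):
    """Yield (path, dirnames, filenames) like os.walk(), without entering skipped dirs."""
    for path, dnames, fnames in os.walk(root):
        # Modify the list in place: os.walk() reads it back to decide
        # which subdirectories to visit next.
        dnames[:] = [d for d in dnames if d not in skip]
        yield path, dnames, fnames
```

Note that the slice assignment `dnames[:] = ...` matters here; rebinding the name with `dnames = ...` would leave the list used by os.walk() unchanged.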

Recursive File Find

Here is one way of recursively finding files matching the pattern r'\.(js|md)' starting from the current directory.
import os
import re

rx = re.compile(r'\.(js|md)')

r = []
for path, dnames, fnames in os.walk('.'):
    r.extend([os.path.join(path, x) for x in fnames if rx.search(x)])
print(r)
['./index.js', './API.md', './ISSUE_TEMPLATE.md', './package.json', './rollup.config.js', './CHANGES.md', './README.
And here is how you can find files larger than 10000 bytes in the hierarchy.
r = []
for path, dnames, fnames in os.walk('.'):
    r.extend([os.path.join(path, x) for x in fnames if os.path.getsize(os.path.join(path, x)) > 10000])
print(r)
How about a list of files that were modified less than 8 hours ago?
import time

r = []
now = time.time()
for path, dnames, fnames in os.walk('.'):
    r.extend([os.path.join(path, x) for x in fnames if now - os.path.getmtime(os.path.join(path, x)) < 8 * 60 * 60])
print(r)
As you can see, it is possible to write arbitrary conditions for selecting files, which makes this approach more powerful than the shell for finding files.
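The three recipes above share the same loop and differ only in the condition, so they can be combined into one generator that takes the conditions as arguments. This is a sketch under that assumption; find_files, name_matches, larger_than, and modified_within are hypothetical helper names, not standard library functions.

```python
import os
import re
import time


def find_files(root, *conditions):
    """Yield paths under `root` for which every condition(path) is true."""
    for path, dnames, fnames in os.walk(root):
        for name in fnames:
            full_path = os.path.join(path, name)
            if all(cond(full_path) for cond in conditions):
                yield full_path


def name_matches(pattern):
    """Condition: the file name (not the whole path) matches the regex."""
    rx = re.compile(pattern)
    return lambda p: rx.search(os.path.basename(p)) is not None


def larger_than(num_bytes):
    """Condition: the file is larger than `num_bytes` bytes."""
    return lambda p: os.path.getsize(p) > num_bytes


def modified_within(hours):
    """Condition: the file was modified less than `hours` hours ago."""
    cutoff = time.time() - hours * 60 * 60
    return lambda p: os.path.getmtime(p) > cutoff
```

For example, `list(find_files('.', name_matches(r'\.(js|md)$'), larger_than(10000), modified_within(8)))` applies all three conditions in a single pass over the hierarchy. The sketch does not handle files that disappear or broken symbolic links during the walk, for which os.path.getsize() raises OSError.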

Conclusion

In this article, we learned how to list the contents of a directory from Python. It is also easy to apply conditions for selecting files using standard Python constructs such as filter() and list comprehensions. Finally, we saw how to process an entire directory hierarchy using os.walk(), including writing conditions for finding files.
  