Monday, 15 September 2014

linux - python script to copy and extract .gz files -




I am starting to learn Python and have a question.

How do I create a script that does the following? (I will write how I would do it in bash.)

Copy a .gz file from remote server1 to local storage: cp /dumps/server1/file1.gz /local/

Then extract that file locally: gunzip /local/file1.gz

Then copy the extracted file to remote server2 (for archiving and deduplication purposes): cp /local/file1.dump /dedupmount

Delete the local copy of the .gz file to free space on the "temporary" storage: rm -rf /local/file1.gz

I need to run all of this in a loop over all files. All files and directories are NFS-mounted on the same server.

A for loop goes through the /dump/ folder and looks for .gz files. Each .gz file is first copied to the /local directory and then extracted there. Once extracted, the unzipped .dmp file is copied to the /dedupmount folder for archiving.

I am just banging my head on the wall over how to write this.

Please help :D

Python solution

While shell code might be shorter, the whole process can be done natively in Python. The key points in the Python solution are:

With the gzip module, gzipped files are as easy to read as normal files.

To obtain the list of source files, the glob module is used. It is modeled after the shell glob feature.

To manipulate paths, use the Python os.path module. It provides an OS-independent interface to the file system.

Here is sample code:
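The path handling from the last point can be sketched in isolation. This is only an illustration: the helper name dest_path_for is hypothetical and does not come from the original answer.

```python
import os.path

def dest_path_for(src_name, dest_dir):
    """Map a source path such as /dumps/server1/file1.gz to dest_dir/file1."""
    base = os.path.basename(src_name)
    # Strip the ".gz" suffix only when it is actually present.
    if base.endswith(".gz"):
        base = base[:-len(".gz")]
    return os.path.join(dest_dir, base)
```

Checking for the suffix before slicing avoids cutting three characters off a name that does not end in .gz.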

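    import gzip
    import glob
    import os.path

    source_dir = "/dumps/server1"
    dest_dir = "/dedupmount"

    # Loop over every gzipped file in the source directory
    for src_name in glob.glob(os.path.join(source_dir, '*.gz')):
        base = os.path.basename(src_name)
        # Drop the trailing ".gz" to get the destination file name
        dest_name = os.path.join(dest_dir, base[:-3])
        # gzip.open in 'rb' mode yields bytes, so write in binary mode too
        with gzip.open(src_name, 'rb') as infile:
            with open(dest_name, 'wb') as outfile:
                for line in infile:
                    outfile.write(line)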

This code reads from the remote1 server and writes to the remote2 server. There is no need for a local copy unless you want one.

In this code, the decompression is done by the CPU on the local machine.
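Because dump files are often binary and may contain very long "lines", a variant that copies in fixed-size chunks can be sketched with shutil.copyfileobj. This is a sketch under assumptions, not the original answer's code: the function name gunzip_all is hypothetical, and the directories are passed in as arguments rather than hard-coded.

```python
import glob
import gzip
import os.path
import shutil

def gunzip_all(source_dir, dest_dir):
    """Decompress every *.gz file in source_dir into dest_dir.

    Returns the list of destination paths that were written.
    """
    written = []
    for src_name in sorted(glob.glob(os.path.join(source_dir, "*.gz"))):
        base = os.path.basename(src_name)
        dest_name = os.path.join(dest_dir, base[:-3])  # drop ".gz"
        with gzip.open(src_name, "rb") as infile, open(dest_name, "wb") as outfile:
            # Stream the decompressed bytes in chunks instead of line by line
            shutil.copyfileobj(infile, outfile)
        written.append(dest_name)
    return written
```

With the paths from the question, the call would be gunzip_all("/dumps/server1", "/dedupmount").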

Shell code

For comparison, here is the equivalent shell code:

    for src in /dumps/server1/*.gz
    do
        base=${src##*/}
        dest="/dedupmount/${base%.gz}"
        zcat "$src" >"$dest"
    done

Three-step Python code

This implements your three-step algorithm:

    import gzip
    import glob
    import os.path
    import shutil

    source_dir = "./dumps/server1"
    dest_dir = "./dedupmount"
    tmpfile = "/tmp/delete.me"

    for src_name in glob.glob(os.path.join(source_dir, '*.gz')):
        base = os.path.basename(src_name)
        dest_name = os.path.join(dest_dir, base[:-3])
        # Step 1: copy the gzipped file to local temporary storage
        shutil.copyfile(src_name, tmpfile)
        # Steps 2 and 3: decompress the local copy straight into the destination
        with gzip.open(tmpfile, 'rb') as infile:
            with open(dest_name, 'wb') as outfile:
                for line in infile:
                    outfile.write(line)

This copies the source file to a temporary file on the local machine, tmpfile, and then gunzips it from there to the destination file. tmpfile is overwritten with every invocation of the script.

Temporary files can be a security issue. To avoid this, place the temporary file in a directory that is writable only by the user who runs the script.
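One way to get such a private directory is the standard tempfile module. The sketch below is an assumption-laden variant, not part of the original answer: the function name copy_then_gunzip is hypothetical, and it needs Python 3.2 or later for tempfile.TemporaryDirectory.

```python
import gzip
import os.path
import shutil
import tempfile

def copy_then_gunzip(src_name, dest_name):
    """Copy src_name into a private temp dir, then decompress it to dest_name."""
    # TemporaryDirectory creates the directory the same way mkdtemp() does,
    # accessible only by the creating user, and removes it on exit.
    with tempfile.TemporaryDirectory() as tmpdir:
        tmpfile = os.path.join(tmpdir, os.path.basename(src_name))
        shutil.copyfile(src_name, tmpfile)
        with gzip.open(tmpfile, "rb") as infile, open(dest_name, "wb") as outfile:
            shutil.copyfileobj(infile, outfile)
```

Because the directory is removed when the with block ends, the local copy of the .gz file is also deleted without a separate cleanup step.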

python linux backup folder gunzip
