
Python + nconvert

Posted: Thu Apr 09, 2009 9:17 pm
by alexb
I downloaded nconvert two days ago. My first look at its "help" gave me the distinct feeling that it had been written in ancient Chinese, but by the fourth or fifth look I began to understand something. I know from experience that this is pretty common when dealing with a new application. :wink:

My goal was to control, in every possible detail, the pdf -> jpeg conversion; the resulting images then have to be processed with Djvu Solo into djvu files. This procedure is what's needed to publish on http://commons.wikimedia.org a good, compressed file collecting the page images of Public Domain books, as a documented source for careful, cooperative transcription in one of the many existing Wikisource projects, the "PD library" branch of the wiki projects.

After some tests from the command prompt and from a bat file, I used for the first time the os.system() function of Python, which sends a string to the OS just as the command prompt does.

My first working, and useful, Python script is the following:
>>> import os
>>> for i in range(2,6):
...     os.system("nconvert -out jpeg -o vipere#.jpg -page # -q 95 -dpi 300 -autocrop 20 255 255 255 vipere.pdf".replace("#",str(i)))

that is the same as I'd get with four commands from the command prompt:
c:> nconvert -out jpeg -o vipere2.jpg -page 2 -q 95 -dpi 300 -autocrop 20 255 255 255 vipere.pdf
c:> nconvert -out jpeg -o vipere3.jpg -page 3 -q 95 -dpi 300 -autocrop 20 255 255 255 vipere.pdf
..... and so on (the Python script simply replaces # with 2, 3, 4, 5 while cycling).
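
The loop above could also be written with subprocess.run and an argument list instead of os.system and string replacement. This is only a minimal sketch: it assumes nconvert is on the PATH, it reuses only the options shown in this post, and build_command and convert_pages are hypothetical helper names, not part of nconvert.

```python
import subprocess


def build_command(page, pdf="vipere.pdf", stem="vipere"):
    """Return the nconvert argument list for one pdf page (hypothetical helper)."""
    return [
        "nconvert", "-out", "jpeg",
        "-o", f"{stem}{page}.jpg",
        "-page", str(page),
        "-q", "95", "-dpi", "300",
        "-autocrop", "20", "255", "255", "255",
        pdf,
    ]


def convert_pages(first, last):
    """Run nconvert once per page, first..last inclusive (assumes nconvert is on the PATH)."""
    for page in range(first, last + 1):
        # check=True raises CalledProcessError if nconvert exits with an error code
        subprocess.run(build_command(page), check=True)
```

Calling convert_pages(2, 5) would issue the same four commands as the loop above. Passing a list avoids quoting problems when a file name contains spaces.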

Well... IT RUNS! :D
Thanks to XnView and Nconvert creators!

Re: Python + nconvert

Posted: Fri Apr 10, 2009 10:33 pm
by alexb
Here is a sample from my second day of attempts.

I have this pdf (here is the first page):
Image
As you can see, every pdf page contains two pages of a book; the book has 90 pages. My problem is to convert this pdf into a set of numbered jpg files, one for each page of the book; so I have to convert the pdf into jpg, autocrop the images, and then crop each image into two images.

Here is my rough, but working, solution. I prefer to use Python to write a bat file, so that I can take a look at it before running it. Obviously it is far from a well designed program! It's only an attempt, and it is not generalized at all; i.e., I know that autocropping the converted jpg pages with my script produces a 2600x2100 pixel jpg, and I use this width and height to crop each image into two images.

Code:

def opal2(x="test", filepdf="Versi_leopardi.pdf", pagine=92):
    # x is not used; it is kept only so that existing calls still work
    # pagine is the number of book pages (two book pages per pdf page)

    # here I open the bat file where this routine writes the nconvert scripts;
    # I'll review it and then run it from the command line
    with open("pyscript.bat", "w") as f:
        for i in range(0, pagine, 2):
            ns = str(i).zfill(3)        # 3-digit number of the left book page
            ns1 = str(i + 1).zfill(3)   # 3-digit number of the right book page

            # template of a nconvert script to convert a page of pdf into jpg and autocrop it
            script = "nconvert -out jpeg -o opal-x.jpg -page # -q 80 -autocrop 20 255 255 255 -dpi 300 " + filepdf

            # here I edit the template to put in the number of the pdf page;
            # i // 2 keeps it an integer (i / 2 would give "0.0" in Python 3)
            f.write(script.replace("#", str(i // 2)) + "\n")

            # this script crops the left half of the jpg obtained by the first script
            script = "nconvert -out jpeg -crop 0 0 1300 2100 -o jpg/opal-##.jpg opal-x.jpg"
            f.write(script.replace("##", ns) + "\n")

            # the next script crops the right half of the jpg obtained by the first script
            script = "nconvert -out jpeg -crop 1300 0 1300 2100 -o jpg/opal-##.jpg opal-x.jpg"
            f.write(script.replace("##", ns1) + "\n")
    # the bat file is closed automatically when the with block ends
Well, this runs too. :D You'll see the resulting pictures embedded in a djvu file at http://commons.wikimedia.org/wiki/Versi ... pardi.djvu, I presume. As you can imagine, I'll edit the script a number of times... but in the meantime I'll use it "as it is" when needed.
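
The script above hard-codes 1300 and 2100, which fit only a 2600x2100 autocropped image. As a rough sketch of how the two -crop rectangles could be derived from any width and height (measured by some other means), assuming the x y width height order used in the script above; half_crops and crop_command are hypothetical helper names:

```python
def half_crops(width, height):
    """Return the (left, right) crop rectangles as (x, y, w, h) tuples."""
    half = width // 2
    left = (0, 0, half, height)
    # the right half takes the remaining columns, so an odd width loses no pixel
    right = (half, 0, width - half, height)
    return left, right


def crop_command(rect, source, target):
    """Build one nconvert crop line in the same form the bat file uses."""
    x, y, w, h = rect
    return f"nconvert -out jpeg -crop {x} {y} {w} {h} -o {target} {source}"
```

For a 2600x2100 image, half_crops gives the same two rectangles that are written by hand in the script.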

Re: Python + nconvert

Posted: Sat Apr 11, 2009 9:42 am
by alexb
Another night has gone by. I took a look at PIL (Python Imaging Library), which was sleeping deeply in my PC, and I see great prospects in combining PIL routines and Nconvert for mass pdf -> image conversion. "Intelligent" cropping can probably be achieved with PIL routines too. I presume my previous script needs a VERY deep review :wink: . Nevertheless, I'll certainly keep the idea of using Python to build a .bat file as its output. It's convenient, and it's safer.

Re: Python + nconvert

Posted: Sun Apr 12, 2009 10:10 pm
by helmut
Thank you, Alex, for sharing your experience and your pioneering Python/nconvert/PIL work! :-)

Re: Python + nconvert

Posted: Mon Apr 13, 2009 6:10 am
by alexb
Here is my present version of the Python-nconvert routines:

Code:

import os
from PIL import Image    # plain "import Image" works only with the old PIL package

def opal(filepdf="test.pdf", pagine=10):

    # the name of the pdf, without its extension, is extracted
    nome = os.path.splitext(filepdf)[0]

    # a folder for the images is created (no error if it already exists)
    os.makedirs(nome, exist_ok=True)

    # a bat file for nconvert batch commands is opened
    with open("pyscript.bat", "w") as f:
        # main loop to write batch nconvert scripts into pyscript.bat
        for i in range(0, pagine):
            ns = f0(i)
            # the page number is inserted directly, so a "#" in a file name is left alone
            script = ("nconvert -out jpeg -o " + nome + "/" + nome + "-" + ns + ".jpg"
                      " -page " + str(i) + " -q 80 -dpi 200 -autocrop 20 255 255 255 " + filepdf)
            f.write(script + "\n")
    return


def tagliajpg(folder="Gl'ingannati", n=83):
    # n is not used; every file found in the folder is processed
    os.makedirs(folder + "_2", exist_ok=True)
    # a sorted list of the files in the folder of images is created
    # (os.listdir alone returns names in arbitrary order)
    lista = sorted(os.listdir(folder))

    # loop over the list of files
    for i, nomefile in enumerate(lista):
        # an image is read
        jpg0 = Image.open(folder + "/" + nomefile)

        # the image's width and height are loaded
        xy0, xy1 = jpg0.size

        # the left part of the image is cropped and saved (a little more than a half)
        jpg1 = jpg0.crop((0, 0, int(xy0 * 0.6), xy1))
        jpg1.save(folder + "_2/" + folder + "-" + f0(i * 2) + ".jpg")

        # the right part of the image is cropped and saved (a little more than a half)
        jpg2 = jpg0.crop((int(xy0 * 0.4), 0, xy0, xy1))
        jpg2.save(folder + "_2/" + folder + "-" + f0(i * 2 + 1) + ".jpg")

    return

def f0(n, w=3):    # converts an integer n into a string at least w characters long, left-filled with zeros
    # zfill never truncates, so numbers with more than w digits are kept whole
    return str(n).zfill(w)
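
Because os.listdir returns names in arbitrary order, the page numbers given to the two halves depend on sorting the listing first. A small sketch of that numbering, which can be printed and checked before any image is touched; plan_split is a hypothetical helper, and the output names follow the folder-NNN.jpg pattern used above:

```python
def plan_split(filenames, folder):
    """Pair each scanned image with the names of its left and right halves."""
    plan = []
    for i, name in enumerate(sorted(filenames)):
        left = f"{folder}-{2 * i:03d}.jpg"        # even number: left book page
        right = f"{folder}-{2 * i + 1:03d}.jpg"   # odd number: right book page
        plan.append((name, left, right))
    return plan
```

Printing plan_split(os.listdir(folder), folder) gives a quick review of which scan produces which pages, in the same spirit as reading the bat file before running it.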

Re: Python + nconvert

Posted: Tue Apr 14, 2009 8:29 am
by alexb
In Italy we say "Ogni scarafone è bello a mamma sua", which means "Every beetle is beautiful in the eyes of its mother" :wink: .

Please consider this when reading my scripts... catch the rough idea only, please don't consider them anything more!