Request: gflIsImage function

Discussions on GFL SDK, the graphic library for reading and writing graphic files


Post by Alessandro »

Hi,

To check whether a file is an image, I currently use gflGetFileInformation(),
but this function is extremely slow when run over many images, and it seems even slower when the images are large.
Can a function like

Code: Select all

 bool gflIsImage(char *filename);
which only looks for magic headers, be added in a future release of gfllib to perform this work at high speed?

Best regards.
MaierMan
Posts: 78
Joined: Wed Aug 04, 2004 8:32 pm

Post by MaierMan »

I recommend naming such a function "gflLikelyImage" or similar, since it would not be an actual check, just a hint.
This fact should be clearly documented.
Otherwise novice GflSDK users will stumble over it for sure.
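
A minimal sketch of what such a hint could look like, assuming only PNG, JPEG, GIF and BMP matter. The names likely_image_bytes and likely_image_file are hypothetical and not part of GFL SDK; the signatures compared are the well-known leading bytes of each format. A match is only a hint, not proof that the file decodes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a GFL SDK function): true when the buffer
 * starts with a well-known image signature. This is a hint only. */
bool likely_image_bytes(const unsigned char *buf, size_t len)
{
    static const unsigned char png[] = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    if (len >= 8 && memcmp(buf, png, 8) == 0)
        return true;                                   /* PNG */
    if (len >= 3 && buf[0] == 0xFF && buf[1] == 0xD8 && buf[2] == 0xFF)
        return true;                                   /* JPEG */
    if (len >= 6 && (memcmp(buf, "GIF87a", 6) == 0 || memcmp(buf, "GIF89a", 6) == 0))
        return true;                                   /* GIF */
    if (len >= 2 && buf[0] == 'B' && buf[1] == 'M')
        return true;                                   /* BMP */
    return false;
}

/* Hypothetical wrapper: reads at most 8 bytes from the start of the file.
 * Returns false if the file cannot be opened. */
bool likely_image_file(const char *filename)
{
    unsigned char buf[8];
    FILE *f = fopen(filename, "rb");
    if (f == NULL)
        return false;
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    return likely_image_bytes(buf, n);
}
```

A caller could run likely_image_file() first and pass only the files that match to gflGetFileInformation(), so the full call is skipped for obvious non-images.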
xnview
Author of XnView
Posts: 46236
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: Request: gflIsImage function

Post by xnview »

gflGetFileInformation is slow??? On which format? This function doesn't read the picture itself...
Pierre.
Guest

Post by Guest »

I mainly use PNG and JPEG files.
It takes about 12 seconds to check whether 108 JPEGs (about 1300x1000 pixels and about 300 KB per image) are images or not.
You can imagine how long it takes to scan a huge set (several thousand) of images, whereas with plain findfirst/findnext the running time is almost nonexistent...
xnview

Post by xnview »

Strange, gflGetFileInformation loads only the header of the file...
Pierre.
Alessandro

Post by Alessandro »

I've never tested how fast the OS opens files. The function read() is indeed faster than fread(), and that may help speed up reading the header (or perhaps a function could be added to gfllib that reads only the magic header, that is, the first DWORD). But if opening the files is itself the culprit of this big slowdown, I may have to rely on filename extensions only.

Thanks for your reply
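
The extension-only fallback mentioned above can be sketched as follows. This is an illustration under assumptions: has_image_extension and equals_ignore_case are hypothetical names, and the extension list is only an example, not the set of formats GFL SDK supports. The check never opens the file, so it says nothing about the actual content.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive string equality using only standard C. */
bool equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return false;
        a++;
        b++;
    }
    return *a == *b; /* equal only if both strings ended together */
}

/* Hypothetical helper: true when the text after the last '.' is one of
 * the listed extensions. The list is illustrative only. */
bool has_image_extension(const char *filename)
{
    static const char *const exts[] = {"png", "jpg", "jpeg", "gif", "bmp"};
    const char *dot = strrchr(filename, '.');

    if (dot == NULL || dot[1] == '\0')
        return false;
    for (size_t i = 0; i < sizeof exts / sizeof exts[0]; i++) {
        if (equals_ignore_case(dot + 1, exts[i]))
            return true;
    }
    return false;
}
```

Because the names come straight from the directory listing (findfirst/findnext), this costs no extra file opens, at the price of trusting the extension.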
xnview

Post by xnview »

Is list view mode in XnView fast for you? It uses the same function...
Pierre.
Alessandro

Post by Alessandro »

Timed with a clock, both XnView and a listview control (in report mode) took 8 seconds on the 108 images mentioned above.
MaierMan

Post by MaierMan »

The speed is quite good here :p

I wrote a small benchmarking tool (single-threaded).
It is available, with source, from http://celebnamer.celebworld.ws/stuff/test_gflfi_speed/

Code: Select all

my lil GflSDK getFileInformation() benchmark :p
(C) 2006 Nils Maier - Code subject to BSD-style license
-------------------------------------------------------
Found 2405 files...
accessing all to fill OS HD cache
benchmarking started
benchmark finished

results
-------
files:            2405
size:       1619166.63 kb
size:          1581.22 MB
size:             1.54 GB
avg.s:          673.25 kb/file
time:       7.83861301 secs
avg.t:      0.00325930 secs/file
Sysinfo

AMD 3800+ X2 - 1.5GB DDR RAM (PC400) - Maxtor 6Y200PO (200GB - IDE) - not-that-much fragmented NTFS
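
The log above describes two passes: touch every file to fill the OS disk cache, then time the per-file call. The following is not the tool's source, just a minimal sketch of that idea under assumptions. The names probe_fn, open_probe, warm_cache and benchmark are hypothetical, open_probe is a stand-in so the sketch builds without GFL, and timing uses C11 timespec_get for wall-clock time.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Signature of the per-file call being measured (hypothetical). */
typedef int (*probe_fn)(const char *filename);

/* Stand-in probe: returns 1 if the file can be opened, else 0.
 * Replace with a wrapper around the real call under test. */
int open_probe(const char *filename)
{
    FILE *f = fopen(filename, "rb");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* Wall-clock time in seconds (C11 timespec_get). */
double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Reads the whole file once and discards the data, so the timed pass
 * is served from the OS cache. Returns the number of bytes read. */
long warm_cache(const char *filename)
{
    char buf[65536];
    long total = 0;
    size_t n;
    FILE *f = fopen(filename, "rb");

    if (f == NULL)
        return 0;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += (long)n;
    fclose(f);
    return total;
}

/* Warms the cache, then times probe over all files.
 * Returns average seconds per file, or 0.0 when count is 0. */
double benchmark(const char *const *files, size_t count, probe_fn probe)
{
    if (count == 0)
        return 0.0;
    for (size_t i = 0; i < count; i++)
        warm_cache(files[i]);

    double start = now_seconds();
    for (size_t i = 0; i < count; i++)
        probe(files[i]);
    return (now_seconds() - start) / (double)count;
}
```

Warming the cache first takes raw disk read speed out of the measurement, so what remains is mostly the cost of opening each file and running the probe on it.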
Alessandro

Post by Alessandro »

The speed is quite bad here :p
my lil GflSDK getFileInformation() benchmark :p
(C) 2006 Nils Maier - Code subject to BSD-style license
-------------------------------------------------------
Found 310 files...
accessing all to fill OS HD cache
benchmarking started
benchmark finished

results
-------
files:             310
size:         95401.16 kb
size:            93.17 MB
size:             0.09 GB
avg.s:          307.75 kb/file
time:      21.25821382 secs
avg.t:      0.06857488 secs/file
Sysinfo

AMD 2600+ , 1GB DDR 400 , Maxtor 160GB IDE , WinXP
MaierMan

Post by MaierMan »

Alessandro wrote:The speed is quite bad here :p
...
Sysinfo

AMD 2600+ , 1GB DDR 400 , Maxtor 160GB IDE , WinXP
Indeed.

Code: Select all

files:            1573
size:        773770.61 kb
size:           755.64 MB
size:             0.74 GB
avg.s:          491.91 kb/file
time:       7.96920195 secs
avg.t:      0.00506624 secs/file
This benchmark was performed on my system (see above), but on the system partition (part of a 2x300GB SATA Maxtor 6L300S0 RAID 0).


Another test, performed by a buddy on a quite badly fragmented partition.

Code: Select all

files:             666
size:        167118.13 kb
size:           163.20 MB
size:             0.16 GB
avg.s:          250.93 kb/file
time:      10.40281861 secs
avg.t:      0.01561985 secs/file
SysInfo: Pentium 3.2 GHz, XP Pro, 1 GB DDR RAM, Seagate 200GB SATA drive (NTFS).
The HD was being used heavily by another app at the same time.

Quite bad too, but still more than 4x better than your machine.

Code: Select all

files:             341  
size:        198468.86 kb  
size:           193.82 MB  
size:             0.19 GB  
avg.s:          582.02 kb/file  
time:       3.24091657 secs  
avg.t:      0.00950415 secs/file
Same machine as above, but this time on an idle, moderately fragmented Seagate Barracuda 160 IDE HD (NTFS).

But I found somebody with worse results.

Code: Select all

files:            2061
size:       1112003.57 kb
size:          1085.94 MB
size:             1.06 GB
avg.s:          539.55 kb/file
time:      60.29231928 secs
avg.t:      0.02925392 secs/file
SysInfo: AMD 64 3200, 2 GIG Ram, 200GB SATA HD, XP Pro SP2

Now this last result is quite an interesting one, isn't it?


I suspect the different antivirus (AV) products are the cause of the performance differences (besides the usual factors such as hardware and/or fragmentation).
  • I use AntiVir Personal (with active scanning, but *without* active jpeg scanning.)
  • The first buddy (with the still moderate results) is on PC-Cillin (with active jpeg scanning).
  • And the 2nd buddy (with the worst results) uses Symantec Corp. Edition (with active jpeg scanning).
Conclusion
---
hmm...