lucy-vray_28_mil_poly_hdri_gi

There are many considerations when choosing a storage format for 3D object data, including storage size, flexibility, performance and portability, just to get started. Herein we compare several formats on real-world models to quantify the differences. Significant portions of the literature target individual components of 3D shape storage (organization of the polygons, efficient triangle-strip algorithms, et cetera).


The Aim@SHAPE Shape Repository is a vast and growing resource for 3D shapes, ranging from artistic works to machined parts, stored in a wide variety of formats. This is a valuable resource for research models of realistic and artificial objects, including many gathered with 3D scanning technology. As can be seen on the right, many of these are designed to be "visual" models. In analysis and simulations, the models are required to be "exact", which creates a more rigorous requirement for tools. Some of the challenges will be covered below.

male01_a male01_b male01_c male01_d
male01_e male01_f male01_g male01_h

Format Comparisons

Available Data


With the wide variety of formats available comes a wide range of capabilities. Some formats permit color/material/appearance within the file structure at a per object, per face and per vertex level. Yet other formats permit storage of vertex normals, face normals, or not, as the model creator needs.

The underlying difficulty comes when comparing one format to another. The same geometric model could have vertex normals, and per vertex colors/materials assigned in one file with none in a second file yielding, at times significant, differences in the disk storage and memory requirements. Even beyond the format flexibility, the user may not require colors/materials or vertex normals for the particular processing they are performing.

One compression strategy commonly used with OpenGL is to store faces as loops, strips or fans. Stored face by face, N triangular faces require N*3 vertices or vertex indices. In a strip, after the first triangle, each further triangle may need only one more vertex or index. That vertex can be stored as an absolute ("actualized") position or relative to an earlier one; relative storage may suffer from object-space drift, yielding openings that are not real. In practice the compression is not perfect, as a single strip may not "fit" the entire geometry, so any given geometry will likely require multiple strips. Loops, strips and fans can offer a large reduction in storage space, but this may come at the expense of operational speed. If the code needs to access a face, for example, its vertices (or vertex indices) are not readily available and must be derived from the face's position in the strip.
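The strip trade-off can be sketched in a few lines of Python. The helper name `strip_to_triangles` is hypothetical, and flipping every other triangle follows the usual OpenGL triangle-strip winding convention; this is an illustration only, not code from any of the tools measured here.

```python
def strip_to_triangles(strip):
    """Expand one triangle strip (a list of vertex indices) into triangles.

    Triangle i uses strip[i], strip[i+1], strip[i+2]; every other triangle
    is flipped so that all faces keep a consistent winding order.
    """
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        if a == b or b == c or a == c:
            continue  # degenerate triangle, often used to stitch strips together
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


strip = [0, 1, 2, 3, 4, 5]            # 6 indices stored in the strip
faces = strip_to_triangles(strip)     # 4 triangles recovered
indices_as_strip = len(strip)
indices_as_list = 3 * len(faces)      # 12 indices if stored face by face
```

Here six stored indices stand in for twelve, but finding face k means walking to position k of the right strip, which is the access cost described above.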

One comparison strategy is the overall byte size of the resulting file, which relates to transmission time over networks and storage space on disks. This can also be normalized by the number of vertices, faces and edges in the model as Bits per Vertex (bpV), Bits per Face (bpF) and Bits per Edge (bpE).

By no means is this the only comparison possible. One technique is to separate the "geometry" (vertices) from the "connectivity" (triangles, polygons, et al). The bits per vertex (bpV) are then quoted for the "geometry" storage alone.

In order to maintain the original author's connectivity of the entire geometry, for CEM models, we must maintain both the "geometry" and the "connectivity". Thus, the comparisons below are bits per vertex (bpV) where the bits are calculated against the entire file size.
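As a worked example of these metrics, the following sketch recomputes the bpV, bpF and bpE values for the Neptune ASCII OFF file from the byte size and element counts quoted in its storage table. The function name `bits_per_element` is a hypothetical helper introduced only for this illustration.

```python
def bits_per_element(file_size_bytes, n_vertices, n_faces, n_edges):
    """Return (bpV, bpF, bpE) computed against the whole file size."""
    bits = 8 * file_size_bytes
    return (bits / n_vertices, bits / n_faces, bits / n_edges)


# Neptune ASCII OFF file: 168,960,696 bytes,
# 2,003,932 vertices, 4,007,872 faces, 6,011,808 edges
bpv, bpf, bpe = bits_per_element(168_960_696, 2_003_932, 4_007_872, 6_011_808)
print(f"{bpv:.2f} {bpf:.2f} {bpe:.2f}")   # 674.52 337.26 224.84
```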

Lossy, lossless

Is the compression strategy lossy or lossless? Does it maintain ALL of the original vertices, faces, edges and connectivity? Does it sacrifice one of these in order to achieve compression? Do you even care?

For example, if your original processes and data are contained within a CAD package and the 3D model you are working on is only for "viewing", then it is perfectly reasonable to use lossy compression and model simplification, as this is not the "original" working copy. On the other hand, if this 3D model IS your working copy, then lossy compression may degrade it over numerous read, modify, save cycles. Similarly, if you have connectivity that IS important, then using a storage format that disregards connectivity may cause difficulties later on.

Binary Considerations

Binary can be significantly more compact than ASCII while maintaining precision. Consider an ASCII representation of a number (207.707047, -90.335808 or -823.182922) requiring 10-11 ASCII characters/bytes. Internally, single-precision floats require 4 bytes while double-precision values require 8 bytes. Note we are not considering compression, in any form, at this stage.
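A minimal sketch of that size difference, using Python's standard `struct` module on the three sample coordinates quoted above. The `%f` text layout (one vertex per line, space separated) is an assumption chosen to mirror the OFF listings in this article.

```python
import struct

vertex = (207.707047, -90.335808, -823.182922)

ascii_line = "%f %f %f\n" % vertex           # text form, one vertex per line
as_float = struct.pack("<3f", *vertex)       # 3 x 4-byte single precision
as_double = struct.pack("<3d", *vertex)      # 3 x 8-byte double precision

print(len(ascii_line), len(as_float), len(as_double))   # 34 12 24
```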

However, one must be cautious of endian nature and storage for later processing by others. If we could only read the data, process it on our own computer and never concern ourselves with portability! In reality, portability is a necessity even if the data will never be shared with others as it is highly likely the computer you are working with today is not the same as a few years ago.

Intel processors show as Little Endian, while MIPS show as Big Endian. What's the difficulty? When migrating the dataset from in-core computer memory to disk file storage, one could simply use fread or fwrite with the appropriate buffer size on the float, int, or double values. However, depending on where you are writing from (big endian, for example) and where the file is read (little endian, at your collaborator's office), the bytes of each value can end up reversed, yielding much confusion for the parsing and processing front-ends of codes.
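The byte reversal can be reproduced on any machine with Python's `struct` module, which lets the byte order be stated explicitly (`>` for big endian, `<` for little endian). This is only a sketch of the failure mode and of one common remedy, namely fixing the byte order in the file format, as the PLY header line `format binary_big_endian 1.0` does.

```python
import struct

value = 207.707047

big = struct.pack(">f", value)       # bytes as a big-endian writer stores them
little = struct.pack("<f", value)    # bytes as a little-endian writer stores them
assert big == little[::-1]           # same four bytes, opposite order

# Reading big-endian bytes under the wrong (little-endian) assumption
misread = struct.unpack("<f", big)[0]
# Reading them with the byte order stated explicitly recovers the value
recovered = struct.unpack(">f", big)[0]
print(misread, recovered)
```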

ASCII Considerations

ASCII files are eminently portable, albeit generally not the most efficient in terms of file size. One challenge with ASCII is finding an efficient means to describe the data regions (vertices, vertex normals, face normals, triangles, et al) with clarity, flexibility and minimum overhead. This may be one reason why such a wide range of formats is available today.

While ASCII may not have to be concerned with Endian nature, another issue rapidly arises during portability especially after several programs have operated on the data set. For example, the original data set may have reasonable precision:

OFF
2003932 4007872 0
207.707047 -90.335808 -822.698364
209.390869 -90.865875 -823.182922
209.308121 -92.984406 -821.091553
...
and after rounds of processing (read in, process, write out) may become
v -71.3479 -599.562 -1003.49
v -25.2006 77.0132 -994.086
v -168.207 -591.965 -1102.19
v -105.306 -595.42 -1057.16
v 7.39791 -590.529 -1140.46
...

Thus, caution should also be exercised even with ASCII formatted files. This is exceedingly important when the dataset is not just for looks (overlapping triangles, coincident vertices, collinear edges and lines, to be clarified later). Models may "look" perfect and yet have significant problem areas when utilized in a numerical analysis code.
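One way such truncation can arise is a writer that uses a default number format. The sketch below assumes a tool that writes with `%g`, which keeps six significant digits, the same number of digits seen in the truncated listing above; whether any particular program behaves this way is an assumption. It also shows that asking for enough digits round-trips the parsed values exactly.

```python
line = "207.707047 -90.335808 -822.698364"
x, y, z = (float(tok) for tok in line.split())

# A writer that uses the default "%g" format keeps only 6 significant digits
rewritten = "%g %g %g" % (x, y, z)
print(rewritten)                     # 207.707 -90.3358 -822.698

# A writer that asks for enough digits round-trips each double exactly
safe = " ".join(repr(v) for v in (x, y, z))
assert [float(tok) for tok in safe.split()] == [x, y, z]
```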

Artistic Models

Neptune

803_neptune_4Mtriangles_manifold1

Full scale statue scanned in 3D and stored in multiple resolutions at the Aim@SHAPE Shape Repository

 

Storage

2,003,932 vertices, 4,007,872 faces, 6,011,808 edges [32bit, i686, Linux 2.6.22.17] [bzip2 -9 Version 1.0.4, 20-Dec-2006.] [gzip -9 1.3.12]
Size (bytes)  Format                                           bpV     bpF     bpE
168 960 696   ASCII 803_neptune_4Mtriangles_manifold.off       674.52  337.26  224.84
54 274 024    Binary 803_neptune_4Mtriangles_manifold.off.gz   216.67  108.33  72.22
51 854 005    Binary cemtach (double-precision)                207.01  103.5   69
48 111 300    Binary 803_neptune_4Mtriangles_manifold.off.bz   192.07  96.03   64.02
37 683 468    Binary cemtach (float-precision)                 150.44  75.22   50.15

Lucy

lucy_01 lucy_02 lucy_02 lucy_02 lucy_02

Full scale statue scanned in 3D, from The Stanford 3D Scanning Repository

Source: Stanford University Computer Graphics Laboratory
Scanner: Stanford Large Statue Scanner
Number of scans: 47
Total size of scans: 58,241,932 points (approx 116 million triangles)
Reconstruction: vrip at 0.5 mm, holefilling
Size of reconstruction: 14,027,872 vertices, 28,055,742 triangles
Comments: hole-free, but contains small bridges due to space carving, so its topological genus is larger than it appears.
It may also have a few topological problems, making it not a proper manifold. Thanks to the Chaos Group for the rendering above.
ply
format binary_big_endian 1.0
element vertex 14027872
property float x
property float y
property float z
element face 28055742
property list uchar int vertex_indices
end_header
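Because this header fully describes a fixed-size binary layout, the file size can be predicted from it. The sketch below assumes Unix line endings in the header and that every face is a triangle (one `uchar` count followed by three 4-byte `int` indices); under those assumptions the result equals the 533 059 290 bytes listed for lucy.ply in the storage table.

```python
header = (
    "ply\n"
    "format binary_big_endian 1.0\n"
    "element vertex 14027872\n"
    "property float x\n"
    "property float y\n"
    "property float z\n"
    "element face 28055742\n"
    "property list uchar int vertex_indices\n"
    "end_header\n"
)

n_vertices, n_faces = 14_027_872, 28_055_742
vertex_bytes = 3 * 4              # x, y, z as 4-byte floats
face_bytes = 1 + 3 * 4            # uchar count + three 4-byte int indices

expected_size = len(header) + n_vertices * vertex_bytes + n_faces * face_bytes
print(expected_size)              # 533059290
```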

Storage

14,027,872 vertices, 28,055,742 triangles [32bit, i686, Linux 2.6.22.17] [bzip2 -9 Version 1.0.4, 20-Dec-2006.] [gzip -9 1.3.12]
Size (bytes)  Format                              bpV     bpF    bpE
533 059 290   Binary lucy.ply                     304     152    -
306 117 166   Binary lucy.ply.gz                  174.58  87.29  -
291 354 147   Binary lucy.ply.bz                  166.16  83.08  -
287 997 041   Binary cemtach (double-precision)   164.24  82.12  -
164 161 465   Binary cemtach (float-precision)    93.62   46.81  -
10,072,906 vertices, 20,145,810 triangles. Note that the QSplat model differs in size, with ~4 million fewer vertices. [32bit, i686, Linux 2.6.22.17] [bzip2 -9 Version 1.0.4, 20-Dec-2006.] [gzip -9 1.3.12]
Size (bytes)  Format                        bpV    bpF    bpE
66 579 696    Binary lucy.qs (QSplat)       52.88  26.44  -
52 941 289    Binary lucy.qs.gz (QSplat)    42.05  21.02  -
51 409 140    Binary lucy.qs.bz (QSplat)    40.83  20.41  -

Bunny

Easily one of the most recognizable 3D models available, accessible and studied today.

bunny01 bunny02

 


ply
format ascii 1.0
comment zipper output
comment modified by flipply
element vertex 35947
property float32 x
property float32 y
property float32 z
property float32 confidence
property float32 intensity
element face 69451
property list uint8 int32 vertex_indices
end_header
-0.0378297 0.12794 0.00447467 0.850855 0.5
-0.0447794 0.128887 0.00190497 0.900159 0.5
-0.0680095 0.151244 0.0371953 0.398443 0.5
...
-0.0310262 0.153728 -0.00354608 0.167698 0.5
-0.0400442 0.15362 -0.00816685 0.734503 0.5
3 20399 21215 21216
3 14838 9280 9186
3 5187 13433 16020
3 5187 16020 16021
...
The OFF formatted file indicates 104 288 unique edges and slightly fewer vertices (34834) than the original PLY formatted file while keeping the same count on faces (69451).
OFF
34834 69451 104288
-0.0955222 0.282596 -0.0386288
-0.154962 0.290696 -0.0606118
-0.353673 0.481931 0.241248
0.208502 0.3015 0.12171
...
-0.0373229 0.503179 -0.107238
-0.11446 0.502255 -0.146765
3 20462 19669 20463
3 8935 14299 8845
...
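The edge count in an OFF header can be checked against the face list. The following is a minimal sketch with a hypothetical helper, `count_unique_edges`, demonstrated on a tetrahedron rather than on the bunny data itself.

```python
def count_unique_edges(faces):
    """Count undirected edges shared among polygon faces given as index tuples."""
    edges = set()
    for face in faces:
        for i, a in enumerate(face):
            b = face[(i + 1) % len(face)]
            edges.add((min(a, b), max(a, b)))   # (a, b) and (b, a) are one edge
    return len(edges)


# Tetrahedron: 4 vertices, 4 triangular faces, 6 edges
tetra = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
n_edges = count_unique_edges(tetra)
print(n_edges)   # 6
```

In a closed triangle mesh every edge is shared by exactly two faces, so the edge count is 3F/2; the Neptune counts (4,007,872 faces, 6,011,808 edges) satisfy this. The bunny's 104,288 edges exceed 3 x 69,451 / 2, which suggests that some of its edges belong to only one face, that is, the surface has open boundaries.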
The OBJ formatted example file contains texture coordinates.
# Texture coordinates generated with:
# ./geomstretchParam -r -ll2 /tmp/bunny_c.obj /tmp/bunny_cr.obj 0 0.1
# OBJ File
# n_vertices 34834
# n_faces 69451
# n_edges 104288
v -0.03782970086 0.1279399991 0.004474670161
v -0.04477940127 0.1288869977 0.001904970035
v -0.06800950319 0.1512439996 0.03719529882
v -0.002287409967 0.1301500052 0.02322009951
...
v -0.01612100005 0.03479310125 0.04553649947
v -0.01517110039 0.03467240185 0.04584569857
v -0.01413749997 0.03470959887 0.04603559896
v -0.01314519998 0.03466780111 0.04627909884
vt 0.04194357246 0.03299826384
vt 0.04065941274 0.03355989978
vt 0.04035732523 0.03835748509
vt 0.05005059764 0.03035461158
...
vt 0.1202315986 0.1167642921
vt 0.1226822063 0.1113534793
f 21217/21217 21216/21216 20400/20400
f 9187/9187 9281/9281 14839/14839
f 16021/16021 13434/13434 5188/5188
f 16022/16022 16021/16021 5188/5188
...
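One detail visible in these listings is that OBJ face indices are 1-based, while PLY and OFF indices are 0-based. The sketch below uses a hypothetical helper, `parse_obj_face`, to convert the first OBJ face above; it does not handle the negative (relative) indices that OBJ also permits.

```python
def parse_obj_face(line):
    """Return 0-based vertex indices from an OBJ 'f' line (v, v/vt or v/vt/vn)."""
    tokens = line.split()
    if tokens[0] != "f":
        raise ValueError("not a face line")
    # Keep the part before the first '/', then shift from 1-based to 0-based
    return [int(tok.split("/")[0]) - 1 for tok in tokens[1:]]


obj_face = parse_obj_face("f 21217/21217 21216/21216 20400/20400")
print(obj_face)   # [21216, 21215, 20399]
```

The result names the same three vertices as the first PLY face listed earlier (`3 20399 21215 21216`), here in the opposite order.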
[32bit, i686, Linux 2.6.22.17] [bzip2 -9 Version 1.0.4, 20-Dec-2006.] [gzip -9 1.3.12]
Size (bytes)  Format                             bpV      bpF     bpE
5 213 426     ASCII bunny-conformal.obj          1197.32  600.53  399.93
4 414 842     ASCII bunny.iv                     982.52   508.54  338.67
3 033 235     ASCII bunny.ply                    675.05   349.4   232.68
2 639 846     ASCII bunny.off                    606.27   304.08  202.5
2 586 359     ASCII bunny.sma                    593.98   297.92  198.4
1 731 249     Binary bunny-conformal.obj.gz      397.6    199.42  132.81
1 524 997     Binary bunny-conformal.obj.bz2     350.23   175.66  116.98
1 277 969     Binary bunny.smb                   293.5    147.21  98.03
1 050 038     Binary bunny.ply.gz                233.69   120.95  80.55
1 002 384     Binary cemtach (double-precision)  223.08   115.46  76.89
955 090       Binary bunny.sma.gz                219.35   110.02  73.27
948 200       Binary bunny.ply.rz (rzip -9)      211.02   109.22  72.74
946 825       Binary bunny.ply.bz                210.72   109.06  72.63
921 913       Binary bunny.iv.gz                 205.17   106.19  70.72
883 518       Binary bunny.sma.bz                202.91   101.77  67.78
877 280       Binary bunny.off.gz                201.48   101.05  67.3
838 032       Binary bunny.ply.7z (p7zip -9)     186.5    96.53   64.29
836 967       Binary bunny.ply.lzma (lzma -9)    186.27   96.41   64.2
803 008       Binary bunny.smb.bz                184.42   92.5    61.6
797 749       Binary bunny.iv.bz2                177.54   91.89   61.2
792 656       Binary bunny.smb.gz                182.04   91.31   60.81
783 923       Binary bunny.off.bz                180.04   90.3    60.14
678 551       Binary bunny.ply.paq_1.paq8o10t    155.84   78.16   52.05
596 572       Binary bunny.ply.paq_7.paq8o10t    137.01   68.72   45.76
490 496       Binary bunny.off.paq_7.paq8o10t    112.65   56.5    37.63
437 395       Binary cemtach (float-precision)   97.34    50.38   33.55

Mechanical Models

Crank

crank_off

With 50012 vertices, 100056 faces and 2 header lines, this OFF file occupies 150070 ASCII lines.

OFF
50012 100056 150084
-0.236683 -0.105617 -0.927022
-0.225231 -0.105388 -0.916759
-0.229387 -0.11809 -0.927022
-0.323651 -0.175749 -0.927022
-0.332386 -0.16127 -0.927022
...
3 1241 1240 1186
3 777 745 1187
3 1156 1129 1176
3 818 1120 790
3 721 842 1156
...

Storage

50012 vertices, 100056 faces, 150084 edges [32bit, i686, Linux 2.6.22.17] [bzip2 -9 Version 1.0.4, 20-Dec-2006.] [gzip -9 1.3.12]
Size (bytes)  Format
3 817 495     ASCII crank.off
1 028 155     Binary cemtach (double-precision)
1 009 659     Binary crank.off.gz
924 217       Binary crank.off.bz
596 966       Binary cemtach (float-precision)

Renault TRM 2000

Renault_TRM_2000

Storage

118533 Points, 178421 Polygons [32bit, i686, Linux 2.6.22.17] [bzip2 -9 Version 1.0.4, 20-Dec-2006.] [gzip -9 1.3.12]
Size (bytes)  Format
9 782 320     Binary Renault_TRM_2000.lwo
4 211 144     Binary Renault_TRM_2000.lwo.gz
4 015 955     Binary Renault_TRM_2000.lwo.bz2
1 876 226     Binary cemtach (double-precision)
1 123 202     Binary cemtach (float-precision)

Blade


A turbine blade with exterior and interior features, from Level of Detail for 3D Graphics and the Large Geometric Models Archive at Georgia Tech. Note the Inventor file contains materials while the ply formatted file does not. Both ply and Inventor utilize indexed face sets and "full" face definitions (not loops, not strips). Note the truncated vertex precision in both ASCII formats yields a significant compression improvement.

fin_2008_01_10a fin_2008_01_10b fin_2008_01_10c fin_2008_01_10d fin_2008_01_10e
fin_2008_01_10f fin_2008_01_10g fin_2008_01_10h fin_2008_01_10i fin_2008_01_10j
fin_2008_01_10k fin_2008_01_10l fin_2008_01_10m blade

882954 vertices, 1765388 faces, 882954 normals and 2648354 ASCII lines.

ply
format ascii 1.0
element vertex 882954
property float x
property float y
property float z
property float nx
property float ny
property float nz
element face 1765388
property list uchar int vertex_indices
end_header
-125 -3.85294 361 0.427604 0.894099 -0.1332
-124.773 -4 361 0.607385 0.765683 -0.211693
-125 -4 360.375 0.553533 0.804079 -0.21693
-125 -3.73394 362 0.372673 0.926908 -0.0442247
-124.5 -4 362 0.553125 0.823008 -0.129273
...
3 882952 882953 865800
3 882952 865800 865799
3 882953 865751 865800
#Inventor V2.1 ascii

Separator {

Coordinate3 {
point [
-125 -3.85294 361,
-124.773 -4 361,
-125 -4 360.375,
-125 -3.73394 362,
-124.5 -4 362,
...
-387 -600.086 248,
-387 -600.01 250,
-387 -600.068 251,
]
}

Normal {
vector [
0.427604 0.894099 -0.1332,
0.607385 0.765683 -0.211693,
0.553533 0.804079 -0.21693,
...
-0.628411 -0.77695 0.038057,
-0.67661 -0.735907 -0.0252881,
-0.656211 -0.754325 0.0195199,
]
}

Material {
diffuseColor [
1 1 1
]
}

MaterialBinding { value PER_VERTEX_INDEXED }

IndexedFaceSet {
coordIndex [
2, 1, 0, -1,
4, 3, 0, -1,
1, 4, 0, -1,
...
865749, 865799, 882952, -1,
865800, 882953, 882952, -1,
865799, 865800, 882952, -1,
865800, 865751, 882953, -1,
]
materialIndex [
0, 0, 0, -1,
0, 0, 0, -1,
0, 0, 0, -1,
0, 0, 0, -1,
...
0, 0, 0, -1,
0, 0, 0, -1,
]
}

}
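The two listings encode the same indexed face set differently: PLY prefixes each face with its vertex count, while Inventor's `coordIndex` is a flat list in which -1 ends each face. A minimal sketch of reading the Inventor form follows; the helper name `split_coord_index` is hypothetical, and real Inventor parsing involves far more than this.

```python
def split_coord_index(coord_index):
    """Split a flat, -1-terminated index list into one tuple per face."""
    faces, current = [], []
    for idx in coord_index:
        if idx == -1:
            if current:
                faces.append(tuple(current))
            current = []
        else:
            current.append(idx)
    if current:                      # tolerate a missing final -1
        faces.append(tuple(current))
    return faces


# First three faces of the coordIndex listing above
iv_faces = split_coord_index([2, 1, 0, -1, 4, 3, 0, -1, 1, 4, 0, -1])
print(iv_faces)   # [(2, 1, 0), (4, 3, 0), (1, 4, 0)]
```

Either way, a triangle costs four tokens (a count or a terminator plus three indices), so the size difference between the two files comes mostly from the extra `materialIndex` list and the punctuation.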

Storage

882954 vertices, 1765388 faces, 882954 normals [32bit, i686, Linux 2.6.22.17] [bzip2 -9 Version 1.0.4, 20-Dec-2006.] [gzip -9 1.3.12]
Size (bytes)  Format
136 036 122   ASCII blade.iv
83 954 806    ASCII blade.ply
24 825 673    Binary blade.iv.gz
24 758 855    Binary blade.ply.zip
24 613 539    Binary blade.ply.gz
23 381 490    Binary blade.ply.bz
22 498 482    Binary blade.iv.bz
14 534 634    Binary cemtach (double-precision, no normals)
10 820 345    Binary cemtach (float-precision, no normals)