Wednesday, March 13, 2013

size of Object.java, and how does it add up?

(Hypotheses and Personal opinion only. I'm no authority.)

Minimum info to be stored in an Object.java instance --

- (4-byte) wait-set as a hidden field -- to hold the waiting threads. The set must be expandable, so perhaps the field is a WS_pointer to a linked list.
** It's probably inefficient to try to "save" memory by building a static VM-wide lookup table of {instance-address, wait-set}. As this table grows, the hashtable lookup would slow down some of the most time-critical operations in the JVM. Without the table, locating the wait-set costs just one pointer dereference.
** This particular WS_pointer starts out as a null pointer.
** Note C# value types don't have a wait-set. C# reference types do. The sync block takes 4 bytes.
** Why a linked list? Well, arrays can't grow, and a vector involves re-allocation when it grows.

- How about the data structure holding the set of threads blocked in lock() or on the synchronized keyword? Is there a mutex "contender set" associated with each Object.java instance? Apparently no such thing is mentioned in popular literature. Is it possible to put these contenders in the wait-set and use a flag to distinguish them from the waiting threads?

- (4-byte) vptr as a hidden field -- If you try to be clever and put this ptr as the first element of the wait-set, then every Object instance must still occupy 64 bits == WS_pointer + 1st element in the wait-set. Nothing is saved, and every virtual call pays an extra dereference. Therefore it's faster to store the vptr directly in the Object instance.

- ? pointer to the class object as a hidden (static) field -- representing the Runtime type of the object.
** This can be stored in the per-class vtbl, as C++ does. This hidden field occupies 32 bits per Class, not per instance.
** I believe myObject.getClass() would use this pointer.
** See my blog post on type info stored inside instances (http://bigblog.tanbin.com/2012/02/type-info-stored-inside-instances-c-c.html).

- hashcode -- Once generated, the identity hashcode must not change, even when garbage collection relocates the object, so it can't simply be recomputed from the current address. (A hashCode() overridden in a subclass may change when object state changes.)
** I feel this can live on the wait-set, since most Object instances never get a hashcode generated.
** Why not use another hidden field? Well, that would require 32 bits of dedicated real estate per object, even if an object needs no hashcode in its lifetime.
** Note a Python hashcode never mutates. See separate blog post.

?? this-pointer as a hidden field -- to hold "my" own address. Remember this.m(...) is compiled to m(this, ...) to let m() access fields of the host object.
** However, I believe a this-pointer is possibly added (as a hidden field) only when you subclass Object.java and introduce a field, and then a method referencing that field. There's no need to add a this-pointer to classes whose instance methods never access instance fields. Such an object doesn't (need to) know its own address -- like a lost child.
** Even in those "other" cases, I feel it's possible for the compiler to remove the this-pointer as a field (reducing memory footprint), because compilers implicitly translate myInstance.m1() to m1(&myInstance), so the caller always supplies the address.
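To check the this-pointer guess in at least one language, here is a minimal C++ sketch. Counter, self() and counter_next() are names I made up for illustration, and the size check is what mainstream compilers give, not something I'd call a guarantee. It suggests the address is supplied by the caller on every call rather than stored inside the instance.

```cpp
#include <cassert>

// Hypothetical class: one int field plus methods that reach it through `this`.
struct Counter {
    int count = 0;

    // Reads and writes a field, so the method needs `this`.
    int next() { return ++count; }

    // Hands back whatever address the caller implicitly passed in.
    const Counter* self() const { return this; }
};

// If `this` were stored per instance, the object would need room for a pointer too.
static_assert(sizeof(Counter) == sizeof(int), "no hidden this-pointer field");

// Roughly what a compiler makes of next(): a free function taking the address.
inline int counter_next(Counter* self) { return ++self->count; }

// Small demonstration; call it from your own main().
inline void demo_this_pointer() {
    Counter c;
    assert(c.self() == &c);          // `this` is just the object's address
    assert(c.next() == 1);           // member-call syntax
    assert(counter_next(&c) == 2);   // same effect with the address passed explicitly
}
```

Calling demo_this_pointer() from main() runs the three checks; the static_assert is checked at compile time.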

In conclusion, I'd guess the minimum size is 8 bytes (vptr + WS_pointer), but the footprint will grow once a hashcode is generated.
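The layout I'm guessing at can be sketched in C++. Everything here is hypothetical (ObjectHeader, WaitSet, WaitNode, identity_hash and the counter-based hash generator are my own stand-ins, not how any real JVM does it). The point is only that two pointers per instance are enough, with the hashcode parked on the lazily allocated wait-set.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical node: one waiting thread in a singly linked list.
struct WaitNode {
    int thread_id;
    WaitNode* next;
};

// Hypothetical side structure, allocated only when an object first needs it.
struct WaitSet {
    WaitNode* head = nullptr;   // waiting threads; grows without re-allocation
    bool has_hash = false;
    std::int32_t hash = 0;      // identity hashcode lives here, not in the header
};

// Hypothetical per-instance header: two pointers, nothing else.
struct ObjectHeader {
    const void* vptr = nullptr;    // would point at the per-class vtbl
    WaitSet* wait_set = nullptr;   // WS_pointer, starts out null
};

static_assert(sizeof(ObjectHeader) == 2 * sizeof(void*), "header is exactly two pointers");

// Generate the hashcode once, then always return the stored value.
inline std::int32_t identity_hash(ObjectHeader& obj) {
    static std::int32_t next_hash = 1;   // stand-in generator; not thread-safe
    if (obj.wait_set == nullptr) obj.wait_set = new WaitSet();
    if (!obj.wait_set->has_hash) {
        obj.wait_set->hash = next_hash++;
        obj.wait_set->has_hash = true;
    }
    return obj.wait_set->hash;
}

// Add a waiting thread at the front of the list.
inline void add_waiter(ObjectHeader& obj, int thread_id) {
    if (obj.wait_set == nullptr) obj.wait_set = new WaitSet();
    obj.wait_set->head = new WaitNode{thread_id, obj.wait_set->head};
}

// Free the side structure and reset the pointer to null.
inline void release(ObjectHeader& obj) {
    if (obj.wait_set == nullptr) return;
    for (WaitNode* n = obj.wait_set->head; n != nullptr;) {
        WaitNode* next = n->next;
        delete n;
        n = next;
    }
    delete obj.wait_set;
    obj.wait_set = nullptr;
}
```

On a 32-bit build the header comes to 8 bytes, matching the guess. An object that never waits and never gets hashed pays nothing beyond those two pointers.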

Q1a: min size of a C# struct or any value type instance?
%%A: C# structs need zero overhead, so an Int32 instance needs just 32 bits.
%%A: When we call myStruct.m2(), there's no vptr involved -- just a compile-time resolved function call as in C and other non-OO languages. The compiler translates it to some_Free_Function(ptr_to_myStruct). Note even for this value type there's no pass-by-value -- the instance itself is not copied. Only its address is handed over, by compiler translation.
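I can't easily show this in C# here, so below is a C++ stand-in for a value type. Celsius, warm() and celsius_warm() are hypothetical names, and the analogy to a C# struct is my assumption. It sketches the two claims above: the call is an ordinary function receiving the instance's address, and the instance carries nothing beyond its payload.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical value type: a 32-bit payload and a non-virtual method.
struct Celsius {
    std::int32_t tenths;   // temperature in tenths of a degree

    // Non-virtual, so the call target is resolved at compile time.
    void warm(std::int32_t delta) { tenths += delta; }
};

// What the method call amounts to: a free function receiving the instance's address.
inline void celsius_warm(Celsius* self, std::int32_t delta) { self->tenths += delta; }

static_assert(!std::is_polymorphic<Celsius>::value, "no virtual functions, so no vptr");
static_assert(sizeof(Celsius) == sizeof(std::int32_t), "payload only, zero overhead");

// Small demonstration; call it from your own main().
inline void demo_value_type() {
    Celsius c{200};
    c.warm(5);             // member-call syntax
    celsius_warm(&c, 5);   // explicit-address form, same instance, no copy made
    assert(c.tenths == 210);
}
```

Both calls modify the same instance, which is what I mean by "no copy".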

Q1b: min size of a C# System.Object instance?
A: Supposed to be 8 bytes minimum (a value type instance has no such overhead)
* 4-byte vptr
* 4-byte sync block
A real instance seems to be 12 bytes long.

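Q2: why is the minimum size of an empty C++ object no more than 1 byte?
%%A: obvious from the analysis above -- with no virtual function there's no vptr, and a C++ object has no built-in wait-set or hashcode, so there's nothing to store.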

Q2b: why not 0 bytes?
A: an object is defined as a storage location. Even if there's no field in it, a=new MyObject() and b=new MyObject() (twice in a row, single-threaded program) must produce 2 objects at 2 distinct locations.
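Both answers can be checked with a few lines of C++. Empty, Poly and distinct_addresses() are hypothetical names of mine. The vptr check reflects how mainstream compilers implement virtual functions; the language standard doesn't promise it.

```cpp
#include <type_traits>

// No fields, no virtual functions.
struct Empty {};

// One virtual function forces a vptr on mainstream compilers.
struct Poly {
    virtual ~Poly() {}
};

static_assert(std::is_empty<Empty>::value, "Empty really has nothing to store");
static_assert(sizeof(Empty) >= 1, "even an empty object occupies storage");
static_assert(sizeof(Poly) >= sizeof(void*), "a polymorphic object carries a vptr");

// Two live objects of the same type must sit at different addresses.
inline bool distinct_addresses() {
    Empty* a = new Empty();
    Empty* b = new Empty();
    bool distinct = (a != b);
    delete a;
    delete b;
    return distinct;
}
```

If an empty object took 0 bytes, a and b could land on the same address and the two objects would be indistinguishable.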

Note the size of an empty C++ string class instance is 12 bytes, according to [[c++ primer]]. The exact figure depends on the library implementation.

According to P89 [[C# in depth]], a C# byte takes 1 byte, but 8+1 bytes of heap usage when "boxed". These 9 bytes are rounded up to 12 bytes (memory alignment). Since heap objects are nameless and accessed only via a pointer, that pointer takes another 4 bytes (assuming a 32-bit machine).
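The rounding arithmetic is easy to spell out. This is a small sketch; round_up and the constant names are mine, and the 4-byte alignment is my assumption for a 32-bit machine.

```cpp
#include <cstddef>

// Round `size` up to the next multiple of `alignment` (alignment must be > 0).
constexpr std::size_t round_up(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) / alignment * alignment;
}

// Figures quoted above: 8 bytes of object overhead plus a 1-byte payload.
constexpr std::size_t kOverhead = 8;
constexpr std::size_t kPayload = 1;

// Assumed 4-byte alignment on a 32-bit machine.
constexpr std::size_t kBoxedSize = round_up(kOverhead + kPayload, 4);

static_assert(kBoxedSize == 12, "9 bytes round up to 12");
```

So a boxed byte spends 3 of its 12 bytes on pure padding.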

About Me

New York (Time Square), NY, United States
http://www.linkedin.com/in/tanbin