Sunday, August 22, 2010

kill a runaway/stuck thread using SureStop

Suppose you suspect your thread is stuck in a nonresponsive I/O call, or a super slow and unnecessary compute job, or more importantly a *deadlock*, you need to restart JVM. However, in some contexts you have a lot of other healthy threads doing useful work halfway, so you don't want to lose those. You want to terminate that thread.

First get a thread dump and try to detect the deadlock cycle by inspection. In "some" cases you can see signs of deadlock, so how do you kill a thread that's holding a lock?

Well, usually the lock is held for a reason, often to prevent access to a guarded object in an inconsistent state. In that case you don't want to risk your system *integrity* by killing the thread and exposing the inconsistent object.

Suppose you know it's ok and want to kill the thread, here's one proposal.

A server often has a GUI admin console showing the live threads. Thread objects register in a static lookup table, so the GUI can monitor and show their status, perhaps by JMX. On GUI, hit "kill" on the target thread. Now the admin thread will get the Thread object from the lookup table and use the SureStop utility to kill it.

Another solution -- remote debugging with authentication. This gives you direct access to suspend any thread or terminate it.

No comments:

Total Pageviews

my favorite topics (labels)

_fuxi (302) _misLabel (13) _orig? (3) _rm (2) _vague (2) clarified (58) cpp (39) cpp_const (22) cpp_real (76) cpp/java/c# (101) cppBig4 (54) cppSmartPtr (35) cppSTL (33) cppSTL_itr (27) cppSTL_real (26) cppTemplate (28) creditMkt (14) db (65) db_sybase (43) deepUnder (31) dotnet (20) ECN (27) econ/bank` (36) fin/sys_misc (43) finGreek (34) finReal (45) finRisk (30) finTechDesign (46) finTechMisc (32) finVol (66) FixedIncom (28) fMath (7) fMathOption (33) fMathStoch (67) forex (39) gr8IV_Q (46) GTD_skill (15) GUI_event (30) inMemDB (42) intuit_math (41) intuitFinance (57) javaMisc (68) javaServerSide (13) lambda/delegate (22) marketData (28) math (10) mathStat (55) memIssue (8) memMgmt (66) metaProgram` (6) OO_Design (84) original_content (749) polymorphic/vptr (40) productive (21) ptr/ref (48) py (28) reflect (8) script`/unix (82) socket/stream (39) subquery/join (30) subvert (13) swing/wpf (9) sysProgram` (16) thread (164) thread_CAS (15) thread_cpp (28) Thread* (22) timeSaver (80) transactional (23) tune (24) tuneDB (40) tuneLatency (30) z_ajax (9) z_algoDataStruct (41) z_arch (26) z_arch_job (27) z_automateTest (17) z_autoTrad` (19) z_bestPractice (39) z_bold (83) z_bondMath (35) z_book (18) z_boost (19) z_byRef^Val (32) z_c#GUI (43) z_c#misc (80) z_cast/convert (28) z_container (67) z_cStr/arr (39) z_Favorite* (8) z_FIX (15) z_forex (48) z_fwd_Deal (18) z_gz=job (33) z_gzBig20 (13) z_gzMgr (13) z_gzPain (20) z_gzThreat (19) z_hib (19) z_IDE (52) z_ikm (5) z_IR_misc (36) z_IRS (26) z_javaWeb (28) z_jdbc (10) z_jobFinTech (46) z_jobHunt (20) z_jobRealXp (10) z_jobStrength (15) z_jobUS^asia (27) z_letter (42) z_linq (10) z_memberHid` (11) z_MOM (54) z_nestedClass (5) z_oq (24) z_PCP (12) z_pearl (1) z_php (20) z_prodSupport (7) z_py (31) z_quant (14) z_regex (8) z_rv (38) z_skillist (48) z_slic`Problem (6) z_SOA (14) z_spring (25) z_src_code (8) z_swingMisc (50) z_swingTable (26) z_unpublish (2) z_VBA/Excel (8) z_windoz (17) z_wpfCommand (9)

About Me

New York (Time Square), NY, United States
http://www.linkedin.com/in/tanbin