Debugging Java VM’s JITed code

Recently I’ve run into some misterious Java VM crashes. After narrowing down the crashing code, I’ve ended up with this POC:

package org.axt.cr;

public class Crash {
	@SuppressWarnings("unused")
	private static int val;
	private static double arr[] = new double[2048];
	
	public static void cr() {
		for(int i=0; i<arr.length; i++) {
			arr[i] = Double.NaN;
		}
		val = (arr[0] == 0 ? 1: 2);
	}
	public static void main(String[] args) {
		cr();
		try {Thread.sleep(2000);} catch(InterruptedException e) {}
		cr();
	}
}

It must be run with -XX:CompileThreshold=100 VM option, to enable the JIT compiler to do it’s work “early”. With this setting the ‘cr’ method is JITed during the sleep, and the second time the JITed code gets executed. Note that you can trace what the JIT compiler is doing with the -XX:+PrintCompilation option. Adding some sysout lines, we can see, that the ‘cr’ method is fully executed, and the crash occurs while returning from the method.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=13364, tid=140737327884032
#
# JRE version: 6.0_26-b03
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C
[error occurred during error reporting (printing problematic frame), id 0xb]
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#

It looks like a stack corruption, and has something to do with NaN-s.

After running the POC with different JavaVM-s, it turned out that only 64bit builds are affected since 1.6.0_25. I figured out if I replace the ?: operator with an if-else statement:

if(arr[0] == 0) val = 1; else val = 2;

the bug disappears. First I’ve dumped the java bytecode, to check for differences (Here you can find an overview of the bytecode instructions):

$ javap -c org.axt.cr.Crash

This is the bytecode with the ?: operator:

   [...]
   24:  getstatic       #12; //Field arr:[D
   27:  iconst_0
   28:  daload
   29:  dconst_0
   30:  dcmpl
   31:  ifne    38
   34:  iconst_1
   35:  goto    39
   38:  iconst_2
   39:  putstatic       #24; //Field val:I
   42:  return

And this with the if-else statement:

   [...]
   24:  getstatic       #12; //Field arr:[D
   27:  iconst_0
   28:  daload
   29:  dconst_0
   30:  dcmpl
   31:  ifne    41
   34:  iconst_1
   35:  putstatic       #24; //Field val:I
   38:  goto    45
   41:  iconst_2
   42:  putstatic       #24; //Field val:I
   45:  return

You can see that after the comparison (24-30) the new value for the variable ‘val’ is pushed onto the stack, and ‘putstatic’ is called (#24 is the reference of field ‘val’ in the constant pool). In the first case it is called after the branching is done, in the second case it is called in the branches. Not a big difference. Afterall in the POC, the condition is always false, so exactly the same code get executed in both of the cases:

dcmpl -> ifne -> iconst_2 -> putstatic -> return

It’s time to check the JITed code. JDK has support for dumping the compiled code. You need a disassembler plugin which you can download from Project Kenai’s homepage or you can build your own, it is part of OpenJDK. You need to rename it to ‘hsdis-amd64.so’ and add it to your library path. You will also need some VM options, so our launcher code will look like this:

LD_LIBRARY_PATH=. java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:+PrintCompilation -XX:CompileThreshold=100 -cp . org.axt.cr.Crash

Here is the dumped JITed code (the second block is the interesting part):

Decoding compiled method 0x00007fffed05e1d0:
[...]
  # {method} 'cr' '()V' in 'org/axt/cr/Crash'
  0x00007fffed05e360: callq  0x00007ffff6e53710  ;   {no_reloc}
  0x00007fffed05e365: data32 data32 nopw 0x0(%rax,%rax,1)
  0x00007fffed05e370: mov    %eax,-0x6000(%rsp)
  0x00007fffed05e377: push   %rbp
  0x00007fffed05e378: sub    $0x20,%rsp
  0x00007fffed05e37c: rex mov (%rsi),%ebp
  0x00007fffed05e37f: mov    %rsi,%rdi
  0x00007fffed05e382: movabs $0x7ffff6ed2880,%r10
  0x00007fffed05e38c: callq  *%r10
  0x00007fffed05e38f: movabs $0xc1269410,%r10   ;   {oop('org/axt/cr/Crash')}
  0x00007fffed05e399: mov    0x258(%r10),%edi   ;*getstatic arr
  0x00007fffed05e3a0: mov    0xc(%rdi),%r8d     
  0x00007fffed05e3a4: cmp    %r8d,%ebp
  0x00007fffed05e3a7: jae    0x00007fffed05e511  ;*dastore
  0x00007fffed05e3ad: mov    %ebp,%r9d
  0x00007fffed05e3b0: inc    %r9d               ;*iinc
  0x00007fffed05e3b3: movsd  -0x7b(%rip),%xmm0  
  0x00007fffed05e3bb: movsd  %xmm0,0x10(%rdi,%rbp,8)  ;*dastore
  0x00007fffed05e3c1: cmp    %r8d,%r9d
  0x00007fffed05e3c4: jge    0x00007fffed05e4b2  ;*if_icmplt
  0x00007fffed05e3ca: add    $0x2,%ebp
  0x00007fffed05e3cd: xor    %ecx,%ecx
  0x00007fffed05e3cf: cmp    %ecx,%ebp
  0x00007fffed05e3d1: cmovl  %ecx,%ebp
  0x00007fffed05e3d4: cmp    %r8d,%r9d
  0x00007fffed05e3d7: jae    0x00007fffed05e4f7
  0x00007fffed05e3dd: movsd  %xmm0,0x10(%rdi,%r9,8)  ;*dastore
  0x00007fffed05e3e4: inc    %r9d               ;*iinc
  0x00007fffed05e3e7: cmp    %ebp,%r9d
  0x00007fffed05e3ea: jl     0x00007fffed05e3d4
  0x00007fffed05e3ec: mov    %r8d,%ecx
  0x00007fffed05e3ef: sub    %r9d,%ecx
  0x00007fffed05e3f2: and    $0xfffffff0,%ecx
  0x00007fffed05e3f5: add    %r9d,%ecx
  0x00007fffed05e3f8: cmp    %ecx,%r9d
  0x00007fffed05e3fb: jl     0x00007fffed05e413
  0x00007fffed05e3fd: mov    %r9d,%ebx
  0x00007fffed05e400: jmpq   0x00007fffed05e49a
  0x00007fffed05e405: data32 data32 nopw 0x0(%rax,%rax,1)
  0x00007fffed05e410: mov    %ebx,%r9d
  0x00007fffed05e413: movsd  %xmm0,0x10(%rdi,%r9,8)  ;*dastore
  0x00007fffed05e41a: mov    %r9d,%ebx
  0x00007fffed05e41d: add    $0x10,%ebx         ;*iinc
  0x00007fffed05e420: movslq %r9d,%r11
  0x00007fffed05e423: movsd  %xmm0,0x18(%rdi,%r11,8)
  0x00007fffed05e42a: movsd  %xmm0,0x20(%rdi,%r11,8)
  0x00007fffed05e431: movsd  %xmm0,0x28(%rdi,%r11,8)
  0x00007fffed05e438: movsd  %xmm0,0x30(%rdi,%r11,8)
  0x00007fffed05e43f: movsd  %xmm0,0x38(%rdi,%r11,8)
  0x00007fffed05e446: movsd  %xmm0,0x40(%rdi,%r11,8)
  0x00007fffed05e44d: movsd  %xmm0,0x48(%rdi,%r11,8)
  0x00007fffed05e454: movsd  %xmm0,0x50(%rdi,%r11,8)
  0x00007fffed05e45b: movsd  %xmm0,0x58(%rdi,%r11,8)
  0x00007fffed05e462: movsd  %xmm0,0x60(%rdi,%r11,8)
  0x00007fffed05e469: movsd  %xmm0,0x68(%rdi,%r11,8)
  0x00007fffed05e470: movsd  %xmm0,0x70(%rdi,%r11,8)
  0x00007fffed05e477: movsd  %xmm0,0x78(%rdi,%r11,8)
  0x00007fffed05e47e: movsd  %xmm0,0x80(%rdi,%r11,8)
  0x00007fffed05e488: movsd  %xmm0,0x88(%rdi,%r11,8)
  0x00007fffed05e492: cmp    %ecx,%ebx
  0x00007fffed05e494: jl     0x00007fffed05e410
  0x00007fffed05e49a: cmp    %r8d,%ebx
  0x00007fffed05e49d: jge    0x00007fffed05e4b2
  0x00007fffed05e49f: nop
  0x00007fffed05e4a0: cmp    %r8d,%ebx
  0x00007fffed05e4a3: jae    0x00007fffed05e4fa
  0x00007fffed05e4a5: movsd  %xmm0,0x10(%rdi,%rbx,8)  ;*dastore
  0x00007fffed05e4ab: inc    %ebx               ;*iinc
  0x00007fffed05e4ad: cmp    %r8d,%ebx
  0x00007fffed05e4b0: jl     0x00007fffed05e4a0
  0x00007fffed05e4b2: test   %r8d,%r8d
  0x00007fffed05e4b5: jbe    0x00007fffed05e525
  0x00007fffed05e4b7: movsd  0x10(%rdi),%xmm0   ;*daload
  0x00007fffed05e4bc: mov    $0x1,%r11d
  0x00007fffed05e4c2: mov    $0x2,%r9d
  0x00007fffed05e4c8: ucomisd -0x188(%rip),%xmm0  
  0x00007fffed05e4d0: jnp    0x00007fffed05e4df
  0x00007fffed05e4d6: pushfq 
  0x00007fffed05e4d7: and    $0xffffffffffffff2b,%rsp
  0x00007fffed05e4de: popfq  
  0x00007fffed05e4df: nop
  0x00007fffed05e4e0: cmovne %r9d,%r11d
  0x00007fffed05e4e4: mov    %r11d,0x25c(%r10)  ;*getstatic arr
  0x00007fffed05e4eb: add    $0x20,%rsp
  0x00007fffed05e4ef: pop    %rbp
  0x00007fffed05e4f0: test   %eax,0xaf99b0a(%rip) 
  0x00007fffed05e4f6: retq   
[...]

I’ve increased the sleep time to 20 seconds, so I could easily attach with ‘gdb’. I’ve set a breakpont to the entry point of the JIT-ed function, and traced it line-by-line. The first part of the function executes SharedRuntime::OSR_migration_end, the next few dozens of instruction are responsible for filling the array, and the last part is responsible for evaluating the condition and the branching. I’ve noticed that during the last part the code the stack pointer is changed so at the end retq will read some garbage from the stack and the VM crashes while trying to execute code from an unmapped address. Just for comparison, here is the dump of the if-else code:

  0x00007fffed06c3f7: movsd  0x10(%rdi),%xmm0
  0x00007fffed06c3fc: ucomisd -0x17c(%rip),%xmm0  
  0x00007fffed06c404: jp     0x00007fffed06c408
  0x00007fffed06c406: je     0x00007fffed06c41f  ;*ifne
  0x00007fffed06c408: movl   $0x2,0x25c(%r10)   ;*getstatic arr
  0x00007fffed06c413: add    $0x20,%rsp
  0x00007fffed06c417: pop    %rbp
  0x00007fffed06c418: test   %eax,0xaf8bbe2(%rip) 
  0x00007fffed06c41e: retq   
  0x00007fffed06c41f: movl   $0x1,0x25c(%r10)   ;*putstatic val
  0x00007fffed06c42a: jmp    0x00007fffed06c413

Funny fact: the JITed code of the if-else is using fewer instructions than the code with ?: operator.

The main difference is in these lines:

  0x00007fffed05e4d6: pushfq 
  0x00007fffed05e4d7: and    $0xffffffffffffff2b,%rsp
  0x00007fffed05e4de: popfq  

We can guess that this lines intended to fixup the flags after the ucomisd instruction, after the result of the comparison is ‘unordered’ (at least one of the operands is NaN or Inf), which is indicated through the parity flag. But instead of operating on the flags register , the code operates on the stack pointer. The good code should be like this:

  0x00007fffed05e4d6: pushfq 
  0x00007fffed05e4d7: and    $0xffffffffffffff2b,(%rsp)
  0x00007fffed05e4de: popfq  

After hunting down the problem, I’ve tried to search for the bug, and found it in Oracle’s bugzilla, and the fix also in OpenJDK7. Grepping through the source it turned out, the following “code blobs” are affected:

instruct cmpF_cc_reg(rFlagsRegU cr, regF src1, regF src2)
instruct cmpF_cc_mem(rFlagsRegU cr, regF src1, memory src2)
instruct cmpD_cc_reg(rFlagsRegU cr, regD src1, regD src2)
instruct cmpD_cc_mem(rFlagsRegU cr, regD src1, memory src2)

In conclusion we can see, that the error is caused by wrongly JITed double or float comparison, the for loop was only needed to trigger the JIT compilation.

If we add ‘-Xcomp’ VM option to the command line, which causes a pre-init JIT compilation of all the functions, we can trigger the bug with the following simplified POC too:

package org.axt.cr;

public class Cr2 {
	@SuppressWarnings("unused")
	private static int val;
	
	public static void cr(double x) {
		val = (x == 0 ? 1: 2);
	}

	public static void main(String[] args) {
		cr(Double.NaN);
	}
}

Four questions left:

  • why we don’t need the cleanup code in the if-else case? is it normal, or is it a bug?
  • can we turn this into an infoleak?
  • can we control the stack to achieve code execution?
  • why Oracle didn’t fix this in the last 10 months?

I will try to investigate the first three questions, and hopefully you will see the results in a followup post.

Advertisements
Posted in Bugs | Tagged , , , , , , | 2 Comments

Benchmark: Rhino vs Chrome V8 on server side

Since this post received a lot of criticism about the measurement method, I’ve created a revised version from it.

Here you can continue with the old post:
Continue reading

Posted in Benchmark | Tagged , , , , | 13 Comments

Analysis of ZDI-11-036, DB2DAS “ping of death”

In this post I try to analyse/reverse engineer ZDI-11-036, a buffer overflow, in the IBM DB2 administration service.

We can check the advisory at ZDI for information about the vulnerability:

The flaw exists within the db2dasrrm component which listens by default on TCP port 524. When allocating a buffer within receiveDASMessage a user supplied length is used as a parameter to malloc(). This buffer is later copied into without any bounds checking and can be made to overflow. A remote attacker can exploit this vulnerability to execute arbitrary code under the context of the das user user.

Continue reading

Posted in Security | Tagged , , , | 2 Comments

Examining native bugs fixed by Java 1.6.26

Yesterday Oracle has released a Critical Patch Update for Java SE. It fixes a few bugs that lead to code execution. I’ve heard rumours about exploit wednesday, which happens after patch tuesday, when hackers start to diff the fixpacks, to discover/examine the fixed vulnerabilites. I was curious, what were the vulnerabilites fixed by this update.
Continue reading

Posted in Security | Tagged , , , , | 1 Comment

Retro: The “snail-line” challenge

While OTW Vortex/12 is under way, another highschool retro I’ve found. The original challenge was to draw out a 200×200 spiral as you can see it on the picture. In hungarian spiral is csigavonal, where csiga means snail and vonal means line. It was necessary, to do it in 320×200 video mode, and wait for a keypress before switching back to textmode and exit. Smaller bytecode wins!

Snail in asm

Continue reading

Posted in Retro | Tagged , , , | Leave a comment

Retro: Highschool reversing (’97)

I’ve stumbled upon a lots of interesting stuff this weekend … nothing special, just some emotional retro stuff.

This piece of code got me 2 month ban from the high school’s PC-lab :-). The interesting in it is, that it is 14 years old, and it is written in the first language, I’ve learned: Pascal. (currently I have no idea, what {$i+} or {$i-} are meaning).

Ok, what is this? This tool could turn on/off the ports of the SMC switches/hubs used by our highscool.

Continue reading

Posted in Retro | Tagged , , , | Leave a comment

Overthewire Vortex Level11, BSD Heap Smashing

It was a while ago, when I posted my last OTW solution. The reason is simple, I’ve ordered A Guide to Kernel Exploitation: Attacking the Core, a great book from twiz and sgrakkyu. You can also check the book’s website. Funny fact is that this level has a slight binding to the book: Smashing phkmalloced heap is similar to exploiting the slab allocator in kernel. If you are interested, you can check this great article about slab allocator exploitation (among other kernel exploitation methods) from the same authors as above.

Let’s go back to our current topic: BSD heap smashing. I’ve checked out all the listed articles, but finally I’ve read this one. (The hosting site has a lot of interesting stuff also.)

Continue reading

Posted in Wargame | Tagged , , | 1 Comment