One of the oddities about Java programs is that they require you to set a maximum heap size when you start the program. What this means in effect is that you need to be able to predict the memory usage of your program before it starts, and whatever heap size you set needs to be appropriate for all of the systems the program is going to run on and all of the datasets it will handle.
When you’re distributing a desktop application which needs to process tens of gigabytes of data this can be a problem. Ideally you’d like to set a heap size of a few gigabytes to give yourself enough overhead to process even the largest of datasets. However, not all machines have that much RAM installed, and even if they do they require a 64 bit OS to be able to use more than 2GB of it in a single process.
Up until now we’ve resorted to setting a lowest common denominator heap size (1500M), and providing instructions for increasing this on systems which can handle it. This is however very inelegant and means we have to warn users if they’re running out of memory and make them save, reconfigure and restart the application.
We have now moved over to using a system which dynamically sets the heap size to an appropriate value at runtime. We do this by writing a Perl wrapper which launches the Java application after having worked out the most appropriate heap size for the current system.
To do this we have to work out:
- Whether we have a 32 or 64 bit JRE to work with
- The amount of physical RAM in the machine
The heap size is set as 2/3 of the physical RAM which leaves enough overhead for the JRE and basic system processes. In addition we set a ceiling on the heap size. For 32 bit systems this is 1500M which is the most you can practically use given the 2GB per process limit (you have to leave something for the JRE itself). For 64 bit systems we set the ceiling at 6GB. It’s not in our interest to set the heap size too high as this ends up resulting in long freezes during garbage collection, so we set it to the largest size we’re practically going to need. We work out if we’re on a 64-bit system by parsing the output of java -version (it doesn’t matter if the OS is 64 bit if the JRE is still 32 bit).
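As a rough illustration of the rule described above, here is a minimal sketch of the calculation on its own. The helper names `jre_bits` and `heap_size_mb` are made up for this example and the version text is an illustrative sample rather than real captured output. The ceilings use the values from the wrapper script (1500 and 6000 MB).

```perl
use strict;
use warnings;

# Ceilings in MB: 1500 for a 32-bit JRE, 6000 (roughly 6GB) for a 64-bit one.
my %CEILING_MB = (32 => 1500, 64 => 6000);

# Hypothetical helper: guess the JRE word size from `java -version` text.
# Anything that doesn't mention "64-Bit" is treated as 32-bit.
sub jre_bits {
    my ($version_text) = @_;
    return $version_text =~ /64-Bit/ ? 64 : 32;
}

# Hypothetical helper: two thirds of physical RAM, capped at the ceiling.
sub heap_size_mb {
    my ($physical_mb, $bits) = @_;
    my $two_thirds = int($physical_mb * 2 / 3);
    my $ceiling    = $CEILING_MB{$bits};
    return $two_thirds < $ceiling ? $two_thirds : $ceiling;
}

# Illustrative sample text, assumed to resemble what a 64-bit JRE prints.
my $sample = qq{java version "1.6.0_20"\n}
           . qq{Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)\n};

printf "-Xmx%dm\n", heap_size_mb(4096, jre_bits($sample));   # prints -Xmx2730m
```

With 4GB of RAM and a 64-bit JRE the two-thirds rule wins (2730 MB). With 16GB the ceiling would take over and the result would be 6000 MB.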
Finding the amount of physical RAM is a platform specific task. On Windows we have to use the Win32 API (the script below queries WMI through Win32::OLE). Under Linux we parse the output of ‘free’, and on OSX and the BSDs we parse the output of top.
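To make the Linux case concrete, here is a small sketch of the same parsing done against a string instead of a live pipe, which makes it easy to try out. The helper name `total_mb_from_free` is hypothetical and the sample text is invented to look like `free -m` output. The regular expression is the one used in the script's get_linux_memory.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the "total" column (in MB) out of `free -m` text.
# Returns undef if no "Mem:" line is found.
sub total_mb_from_free {
    my ($free_output) = @_;
    if ($free_output =~ /^Mem:\s+(\d+)/m) {
        return $1;
    }
    return undef;
}

# Illustrative sample, not captured from a real machine.
my $sample = <<'END';
             total       used       free     shared    buffers     cached
Mem:          3953       3528        424          0        205       1690
-/+ buffers/cache:       1632       2320
Swap:         4095         12       4083
END

print total_mb_from_free($sample), "\n";   # prints 3953
```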
If the user doesn’t like our auto-configured heap size we allow them to override it by passing a -m argument to the wrapper script.
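The override logic can be sketched in isolation like this. The helper name `resolve_heap_mb` and the usage message are assumptions made for the example. It uses GetOptionsFromArray from Getopt::Long so that it can be exercised without touching the real command line, whereas the wrapper itself calls GetOptions on @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: return the user's heap size in MB if they passed
# -m/--memory, otherwise fall back to the automatically chosen value.
sub resolve_heap_mb {
    my ($args, $auto_mb) = @_;
    my $memory;
    GetOptionsFromArray($args, 'memory=i' => \$memory)
        or die "Usage: wrapper [-m megabytes]\n";
    return $auto_mb unless defined $memory;
    # Same lower bound as the wrapper script.
    die "Memory allocation must be at least 500M\n" if $memory < 500;
    return $memory;
}

print resolve_heap_mb(['-m', '3000'], 2730), "\n";   # prints 3000
print resolve_heap_mb([], 2730), "\n";               # prints 2730
```

Because Getopt::Long allows unambiguous abbreviations by default, declaring the option as 'memory=i' is enough for -m to work.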
For Unix-like OSs we don’t need to worry about Perl being present, but for Windows we compile the script into a Windows binary using pp. The Win32::OLE module is bundled into this binary. On other platforms we don’t need to distribute this since it’s loaded dynamically at runtime only if Perl’s $^O variable tells us we’re running under Windows. Under OSX we can run the wrapper nicely from the command line. We’re still working out the best way to include this as part of an application bundle.
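The runtime loading trick looks roughly like the sketch below. The helper name `try_load` is hypothetical. The point is that `require` inside an `eval` only runs when the code path is reached, so a Windows-only module is never needed on other platforms.

```perl
use strict;
use warnings;

# Hypothetical helper: try to load a module by name at runtime.
# Returns 1 on success and 0 if the module can't be loaded.
sub try_load {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Win32::OLE -> Win32/OLE.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Only touch the Windows-only module when perl reports a Windows build.
# /Win/ is case-sensitive, so it matches "MSWin32" but not "darwin" or "cygwin".
if ($^O =~ /Win/) {
    try_load('Win32::OLE') or warn "Couldn't load Win32::OLE\n";
}
```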
This isn’t perhaps the cleanest of solutions, but compared to the very manual process we had before it’s a lot easier for the users to get their systems set up optimally.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use English;
    use FindBin qw($RealBin);
    use Getopt::Long;
    use IPC::Open3;

    my @java_args;

    # See if they manually set a heap size
    my $memory;
    my $result = GetOptions(
        'memory=i' => \$memory,
    );

    if ($memory) {
        if ($memory < 500) {
            die "Memory allocation must be at least 500M";
        }
    }
    else {
        $memory = determine_optimal_memory();
    }

    unshift @java_args, "-Xmx${memory}m";

    exec "java", @java_args, "uk.ac.bbsrc.babraham.SeqMonk.SeqMonkApplication";

    sub print_error {
        # We wrap errors like this so we can keep a windows shell window open
        # so the user can see any errors we generate
        my ($error) = @_;
        warn $error;
        $_ = <STDIN>;
        exit 1;
    }

    sub determine_optimal_memory {
        # We'll set a ceiling for the memory allocation. On a 32-bit OS this is going
        # to be 1500m (the max it can safely handle), on a 64-bit OS we won't take more
        # than 6GB
        my $max_memory = 1500;

        # We need not only a 64 bit OS but 64 bit java as well. It's easiest to just test
        # java since the OS support must be there if you have a 64 bit JRE.
        # java -version writes to STDERR, so we read both streams on the same handle.
        open3(\*IN, \*OUT, \*OUT, "java -version") or print_error("Can't find java");
        close IN;
        while (<OUT>) {
            if (/64-Bit/) {
                $max_memory = 6000;
            }
        }
        close OUT;

        warn "Memory ceiling is $max_memory\n";

        # The way we determine the amount of physical memory is OS dependent.
        my $os = $^O;
        my $physical;

        if ($os =~ /Win/) {
            $physical = get_windows_memory($max_memory);
        }
        elsif ($os =~ /darwin/ or $os =~ /bsd/i) {
            $physical = get_osx_memory($max_memory);
        }
        else {
            $physical = get_linux_memory($max_memory);
        }

        warn "Raw physical memory is $physical\n";

        # We then set the memory to be 2/3 of the physical memory
        # or the ceiling, whichever is lower.
        $physical = int(($physical/3)*2);

        if ($max_memory < $physical) {
            return $max_memory;
        }

        warn "Using $physical MB of RAM to launch seqmonk\n";
        return $physical;
    }

    sub get_linux_memory {
        # We get the amount of physical memory on linux by parsing the output of free
        open (MEM, "free -m |") or print_error("Can't launch free on linux: $!");
        while (<MEM>) {
            if (/^Mem:\s+(\d+)/) {
                return $1;
            }
        }
        close MEM;
        print_error("Couldn't parse physical memory from the output of free");
    }

    sub get_osx_memory {
        # We get the amount of physical memory on OSX by parsing the output of top
        open (MEM, "top -l 1 -n 0 |") or print_error("Can't get amount of memory on OSX: $!");
        my $total_mem = 0;
        while (<MEM>) {
            if (/^PhysMem:.*?(\d+)M\s+used,\s+(\d+)M\s+free/) {
                $total_mem += $1;
                $total_mem += $2;
            }
        }
        close MEM;

        unless ($total_mem) {
            print_error("Couldn't parse physical memory from the output of top");
        }
        return $total_mem;
    }

    sub get_windows_memory {
        warn "Getting windows physical memory\n";

        # This code was adapted from an answer posted by Tom Feiner on
        # stackoverflow
        #
        # http://stackoverflow.com/questions/423797/how-can-i-find-the-exact-amount-of-physical-memory-on-windows-x86-32bit-using-per
        my ($max_memory) = @_;

        # Load Win32::OLE at runtime so other platforms never need it.
        # The import list needs explicit parentheses on perl 5.18 and later.
        eval {
            require Win32::OLE;
            Win32::OLE->import(qw( EVENTS HRESULT in ));
            1;
        } or do {
            print_error("Couldn't load Win32 module to determine windows memory");
        };

        my $WMI = Win32::OLE->GetObject( "winmgmts:{impersonationLevel=impersonate,(security)}//./" )
            || print_error ("Could not get Win32 object: $OS_ERROR");

        # Sum the capacity (in bytes) of every installed memory module
        my $total_capacity = 0;
        foreach my $object (in($WMI->InstancesOf( 'Win32_PhysicalMemory' ))) {
            $total_capacity += $object->{Capacity};
        }

        my $total_capacity_in_mb = $total_capacity / (1024*1024);
        return $total_capacity_in_mb;
    }