[one-users] problems uploading large images

Claude Noshpitz cnoshpitz at attinteractive.com
Tue Aug 10 16:29:16 PDT 2010


Hello,

I'm having trouble using either econe-upload or oneimage to push images
larger than the amount of physical RAM on the ONE master.  It appears that
an attempt is made to buffer the whole image content in memory somewhere
along the line; odd messages like this appear:

tcmalloc: large alloc 2181038080 bytes == 0x187ab4000 @
tcmalloc: large alloc 18446744072140881920 bytes == (nil) @
tcmalloc: large alloc 4362076160 bytes == 0x209eb4000 @

Not good.  So the first attack was to run econe-server under Unicorn to
properly buffer the POSTed "attachment" to disk before trying to complete
the request.  There are obviously other ways to do this, but Unicorn is
awesome :)  It did require some tweaking of econe-server, which I won't
include here because it was really just a hack to get stuff working.
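
To illustrate the general idea of spooling an upload to disk instead of holding
it in RAM, here is a minimal sketch.  It is not the econe-server change and not
how Unicorn does it internally; spool_to_disk and CHUNK_SIZE are hypothetical
names made up for this example.

```ruby
require 'tempfile'
require 'stringio'

# Hypothetical helper (not part of econe-server): copy an upload stream to
# a temp file in fixed-size chunks so memory use stays bounded by the chunk
# size rather than by the size of the image.
CHUNK_SIZE = 64 * 1024

def spool_to_disk(input, chunk_size = CHUNK_SIZE)
  tmp = Tempfile.new('upload')
  tmp.binmode
  buffer = String.new
  # IO#read(length, outbuf) reuses the buffer and returns nil at EOF.
  while input.read(chunk_size, buffer)
    tmp.write(buffer)
  end
  tmp.flush
  tmp.rewind
  tmp
end

# Usage: any IO-like object works; a StringIO stands in for the request body.
spooled = spool_to_disk(StringIO.new("fake image data"))
puts spooled.size
spooled.close!
```

Only one chunk is resident at a time, so the size of the image no longer
matters to the process doing the copy.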

Next, we have to get control of the HTTP POST timeout since it takes a
couple of minutes to transfer 4 GB.  This could perhaps be done via a
command-line argument, as discussed in http://dev.opennebula.org/issues/195,
but here's a simple workaround patch:

--- a/src/cloud/ec2/bin/econe-upload
+++ b/src/cloud/ec2/bin/econe-upload
@@ -107,7 +107,7 @@ end
 auth = "#{access}:#{secret}" if secret && access
 
 begin
-    ec2_client = EC2QueryClient::Client.new(auth,url)
+    ec2_client = EC2QueryClient::Client.new(auth,url,600)
 rescue Exception => e
     puts "#{cmd_name}: #{e.message}"

With some tweaking I made that work, only to track down another failure,
"execution expired", originating from the xml-rpc server interface in
OpenNebula.rb.  That timeout happens because the 'allocate' method on Image
in the xml-rpc server ends up needing to copy bits and so takes a long time
for a lot of bits.  Here's a quick-and-dirty patch:

--- a/src/oca/ruby/OpenNebula.rb
+++ b/src/oca/ruby/OpenNebula.rb
@@ -109,6 +109,7 @@ module OpenNebula
             end
 
             begin
+                server.timeout = 600
                 response = server.call("one."+action, @one_auth, *args)
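
For anyone puzzled by the wording: "execution expired" is the default message
of Ruby's Timeout::Error.  The sketch below reproduces it with a stand-in for a
slow call and shows that raising the limit lets the same call finish.  slow_call
and call_with_timeout are hypothetical names, and that the XML-RPC client hits
its limit in this same way is my assumption, not something I traced.

```ruby
require 'timeout'

# Stand-in for a slow server-side operation such as a large image copy;
# purely illustrative, not OpenNebula code.
def slow_call(seconds)
  sleep(seconds)
  :done
end

# Run the slow call under a time limit.  Returns the call's result, or the
# error message if the limit is exceeded.
def call_with_timeout(limit, work_seconds)
  Timeout.timeout(limit) { slow_call(work_seconds) }
rescue Timeout::Error => e
  e.message
end

puts call_with_timeout(0.05, 0.5)  # limit too short for the work
puts call_with_timeout(1, 0.01)    # limit comfortably longer than the work
```

A fixed 600 seconds is of course just a bigger guess; a configurable limit
would be the cleaner fix.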


So finally I can use econe-upload.  Yay!

But... 'oneimage add' is still broken, with the same
trying-to-copy-a-big-file-into-ram behavior as above.  Before I start
hacking that, I thought it'd be worth asking if anyone has run into this
problem and perhaps addressed it a bit more elegantly than I am?

Thanks!

--Claude



