Monday, August 11, 2014

How to implement statfs syscall for Ruby 1.8 (pre Fiddle)

Here is a statfs implementation for older Ruby releases that don't have Fiddle:

#!/usr/bin/ruby

# gcc -x c -E <(echo "#include <sys/vfs.h>") | less
# struct statfs {
#   long int f_type
#   long int f_bsize
#   unsigned long int f_blocks
#   unsigned long int f_bfree
#   unsigned long int f_bavail
#   unsigned long int f_files
#   unsigned long int f_ffree
#   struct { int __val[2] } f_fsid
#   long int f_namelen
#   long int f_frsize
#   long int f_flags
#   long int f_spare[4]
# }
# extern int statfs (const char *__file, struct statfs *__buf)

require 'ostruct'

module Sys
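  # From /usr/include/asm-x86_64/unistd.h; this number is specific to
  # x86_64 Linux.
  SYS_STATFS = 137

  # 15 64-bit words hold the struct on x86_64; the 16th is the sentinel.
  SENTINEL = 0x91d210f49de98115
  FMT_STATFS = "Q16"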

  # Use SENTINEL to check that the syscall doesn't overflow the buffer.
  # Note: there is a bug in Kernel#syscall in newer (post-1.8.6) Rubies:
  # you cannot pass a buffer containing NUL octets :( Luckily, in this
  # case the whole buffer can be filled with non-NUL octets, because the
  # buffer is only used for returning data from the syscall.

  def self.statfs(filename)
    buf = ([SENTINEL] * 16).pack FMT_STATFS
    syscall SYS_STATFS, filename, buf

    k = %w{type bsize blocks bfree bavail files ffree fsid namelen frsize
           flags spare1 spare2 spare3 spare4}
    v = buf.unpack FMT_STATFS
    raise TypeError if v.pop != SENTINEL

    OpenStruct.new k.zip(v)
  end
end

st = Sys.statfs('/tmp')
puts st.inspect

Output:

#<OpenStruct bavail=212090654, type=61267, flags=4128, files=60506112,
spare1=0, blocks=238218731, ffree=59367476, spare2=0, fsid=15391897869155704811,
spare3=0, bsize=4096, namelen=255, spare4=0, bfree=224191466, frsize=4096>
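
The sentinel check is easy to try without making a real syscall. The following is a minimal sketch, assuming a hypothetical fake_fill helper that stands in for the kernel and overwrites the first slots of the buffer; only the pack/unpack handling mirrors the code above.

```ruby
SENTINEL = 0x91d210f49de98115
FMT      = "Q16"   # 15 words for the struct plus 1 trailing sentinel word

# Hypothetical stand-in for the syscall: overwrites the first `words`
# 64-bit slots of buf in place, the way the kernel fills the struct.
def fake_fill(buf, words)
  data = (1..words).to_a.pack("Q*")
  buf[0, data.bytesize] = data
  buf
end

# Returns the 15 struct words, or raises if the trailing sentinel was
# overwritten (more data was written than the struct should occupy).
def checked_unpack(buf)
  v = buf.unpack(FMT)
  raise TypeError, "buffer overflow detected" if v.pop != SENTINEL
  v
end

buf = ([SENTINEL] * 16).pack(FMT)
p checked_unpack(fake_fill(buf.dup, 15)).length   # prints 15

begin
  checked_unpack(fake_fill(buf.dup, 16))          # clobbers the sentinel
rescue TypeError => e
  puts e.message                                  # prints "buffer overflow detected"
end
```

The check only detects an overflow of up to one word, which is enough to catch a wrong struct size guess on a different architecture.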

Saturday, August 9, 2014

How to implement statfs syscall with Ruby Fiddle FFI

Ruby will retire its broken Kernel#syscall method in the future. In current Ruby releases you cannot pass NUL octets to syscalls, which breaks almost every syscall that takes a struct as a parameter (Ruby bug 1472). Fortunately, starting with Ruby 1.9.2 there is the Fiddle standard library, which can be used to call foreign functions such as C library functions, including the wrappers for syscalls. Unfortunately, Fiddle is still (as of 2.1.2) poorly documented, and very few sites have examples of cases more complex than calling sin(), strcpy(), or strdup(). Here is a statfs() implementation using Fiddle:

#!/usr/bin/ruby

# gcc -x c -E <(echo "#include <sys/vfs.h>") | less
# struct statfs {
#   long int f_type
#   long int f_bsize
#   unsigned long int f_blocks
#   unsigned long int f_bfree
#   unsigned long int f_bavail
#   unsigned long int f_files
#   unsigned long int f_ffree
#   struct { int __val[2] } f_fsid
#   long int f_namelen
#   long int f_frsize
#   long int f_flags
#   long int f_spare[4]
# }
# extern int statfs (const char *__file, struct statfs *__buf)

require 'fiddle'
require 'fiddle/import'

module Sys
  module Int
    extend Fiddle::Importer
    dlload Fiddle.dlopen(nil)   # open the running process itself, which includes libc

    Struct_statfs = struct <<-EOS.gsub %r{^\s+}, ''
      long type,
      long bsize,
      unsigned long blocks,
      unsigned long bfree,
      unsigned long bavail,
      unsigned long files,
      unsigned long ffree,
      int fsid[2],
      long namelen,
      long frsize,
      long flags,
      long spare[4]
    EOS

    extern 'int statfs (const char *__file, struct statfs *__buf)'
  end

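  # Returns a Struct_statfs filled in by statfs(), or raises the
  # matching Errno exception if the call fails.
  def self.statfs(file)
    buf = Int::Struct_statfs.malloc
    val = Int::statfs(file, buf)
    # SystemCallError.new(errno) builds the matching Errno::* exception
    raise SystemCallError.new(Fiddle.last_error) unless val == 0
    buf
  end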
end

buf = Sys.statfs('/tmp')

puts "type of file system: #{buf.type}"
puts "optimal transfer block size: #{buf.bsize}"
puts "total data blocks in file system: #{buf.blocks}"
puts "free blocks in fs: #{buf.bfree}"
puts "free blocks available to unprivileged user: #{buf.bavail}"
puts "total file nodes in file system: #{buf.files}"
puts "free file nodes in fs: #{buf.ffree}"
puts "file system id: #{buf.fsid}"
puts "maximum length of filenames: #{buf.namelen}"
puts "fragment size: #{buf.frsize}"
puts "flags: #{buf.flags}"
puts "spare: #{buf.spare}"

Output:

type of file system: 16914836
optimal transfer block size: 4096
total data blocks in file system: 983850
free blocks in fs: 972634
free blocks available to unprivileged user: 972634
total file nodes in file system: 983850
free file nodes in fs: 980728
file system id: [0, 0]
maximum length of filenames: 255
fragment size: 4096
flags: 32
spare: [0, 0, 0, 0]

Compared to the output of df command:

% df --block-size=4k /tmp
Filesystem     4K-blocks  Used Available Use% Mounted on
tmpfs             983850 11216    972634   2% /tmp
% df -i /tmp
Filesystem     Inodes IUsed  IFree IUse% Mounted on
tmpfs          983850  3122 980728    1% /tmp

So this is a fast way to get file system statistics such as disk usage in Ruby. You no longer need to run the df command with backticks and parse its output with regexps to get the stats.
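
The raw counters map onto df's columns with a little arithmetic. The following is a sketch using the sample numbers from the output above; disk_usage is a hypothetical helper, and it assumes the counters are expressed in units of the given block size (bsize and frsize are both 4096 in the sample).

```ruby
# Hypothetical helper, not part of the code above: turns raw statfs
# counters into df-style numbers. Assumes the counters are expressed
# in units of block_size bytes.
def disk_usage(blocks, bfree, bavail, block_size)
  used  = blocks - bfree
  total = used + bavail   # space visible to unprivileged users
  {
    :used_bytes      => used * block_size,
    :available_bytes => bavail * block_size,
    # Integer ceiling division; rounding up reproduces the 2% that df
    # reports for the sample numbers.
    :use_percent     => total.zero? ? 0 : (used * 100 + total - 1) / total
  }
end

usage = disk_usage(983850, 972634, 972634, 4096)
puts "used: #{usage[:used_bytes]} bytes"            # prints "used: 45940736 bytes"
puts "available: #{usage[:available_bytes]} bytes"  # prints "available: 3983908864 bytes"
puts "use%: #{usage[:use_percent]}"                 # prints "use%: 2"
```

The same function applied to the inode counters (files and ffree, with a block size of 1) gives the 1% shown by df -i.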

It seems that a future Ruby release (2.2?) will have statfs() built into core as File::Statfs. In the meantime, the Fiddle version can be handy.
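
For readers new to Fiddle, the import pattern used above can be reduced to a syscall wrapper that takes no struct at all. This is a minimal sketch, assuming a libc that exports getpid() and a Ruby build with Fiddle available; the module name Libc is made up for this example.

```ruby
require 'fiddle'
require 'fiddle/import'

# Libc is a hypothetical module name chosen for this sketch.
module Libc
  extend Fiddle::Importer
  dlload Fiddle.dlopen(nil)   # symbols of the running process, libc included

  # Declares Libc.getpid from a C prototype, as done for statfs above.
  extern 'int getpid(void)'
end

puts "pid via Fiddle: #{Libc.getpid}"
puts "pid via Ruby:   #{Process.pid}"
```

Both lines should print the same number, which is a quick way to confirm that the binding works before moving on to functions that need struct arguments.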

Tuesday, July 29, 2014

Or, how to get the anonymous module from Kernel#load with wrap=true?

With the same technique as in the previous post, it is also possible to capture the anonymous wrapper module that Kernel#load uses to build the sandbox when wrap=true. This way you can load configuration or other data without polluting your namespace.

File b.rb:

class Foo
  def bar
    puts "bar"
  end
end

throw :wrapper, Module.nesting.last

File main2.rb:

mod = catch :wrapper do
  load File.expand_path('b.rb'), true
end
print "Anonymous module is '#{mod}' (#{mod.class})\n"
mod::Foo.new.bar
print "This will not work due the sandbox:\n"
Foo.new.bar

Output:

Anonymous module is '#<Module:0x007fbccc2ace08>' (Module)
bar
This will not work due the sandbox:
main2.rb:7:in `<main>': uninitialized constant Foo (NameError)

This can be packaged into a module for easier use. Unfortunately, the call to Module.nesting must appear lexically in the loaded file for this to work, so it cannot be moved into the Sandbox module. One possible implementation is the following.

File c.rb:

class Foo
  def bar
    puts "bar"
  end
end

Sandbox.exit Module.nesting.last

File main3.rb:

module Sandbox
  def self.load(file)
    catch :sandbox do
      Kernel.load file, true
    end
  end

  def self.exit(value)
    throw :sandbox, value
  end
end

mod = Sandbox.load('c.rb')
print "Anonymous module is '#{mod}' (#{mod.class})\n"
mod::Foo.new.bar
print "This will not work due the sandbox:\n"
Foo.new.bar

Output:

Anonymous module is '#<Module:0x007f4d8b4f8500>' (Module)
bar
This will not work due the sandbox:
main3.rb:16:in `<main>': uninitialized constant Foo (NameError)

Simple and elegant :)
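
As an illustration of the configuration use case, here is a sketch that writes a small settings file to a temporary directory and loads it through the same catch/throw wrapper. The file name settings.rb, the SETTINGS and CONFIG_SOURCE constants and the load_sandboxed helper are all made up for this example.

```ruby
require 'tmpdir'

# Hypothetical helper: same idea as Sandbox.load above, but the loaded
# file throws the wrapper module directly to :sandbox.
def load_sandboxed(path)
  catch :sandbox do
    Kernel.load path, true
    nil   # reached only if the file never throws
  end
end

# Hypothetical config file contents; the last line hands back the wrapper.
CONFIG_SOURCE = <<-EOS
SETTINGS = { :host => "localhost", :port => 8080 }
throw :sandbox, Module.nesting.last
EOS

Dir.mktmpdir do |dir|
  path = File.join(dir, 'settings.rb')
  File.open(path, 'w') { |f| f.write(CONFIG_SOURCE) }

  mod = load_sandboxed(path)
  puts mod::SETTINGS[:port]                # read the setting through the wrapper
  puts Object.const_defined?(:SETTINGS)    # the constant did not leak to top level
end
```

Returning nil when the file does not throw makes a missing throw line easy to detect in the caller.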

How to return a value from a loaded file?

Have you ever wanted to get a return value from a file that is read with Kernel#load, without resorting to global variables or constants? It is actually quite simple once you realize that you can use a non-local exit (catch/throw) to do that.

File a.rb:

class Foo
  def bar
    puts "bar"
  end
end

throw :value, "hello"

File main.rb:

value = catch :value do
  load File.expand_path('a.rb')
end
print "a.rb returned value '#{value}'\n"
Foo.new.bar

Output:

a.rb returned value 'hello'
bar
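
One detail worth knowing: if the loaded file never calls throw, catch simply returns the value of its block, which in main.rb is the true returned by load. A minimal sketch, assuming a hypothetical load_value helper that substitutes an explicit default in that case:

```ruby
require 'tmpdir'

# Hypothetical helper: returns what the file throws to :value, or
# `default` when the file runs to completion without throwing.
def load_value(path, default = nil)
  catch :value do
    load path
    default
  end
end

Dir.mktmpdir do |dir|
  throwing = File.join(dir, 'throwing.rb')
  silent   = File.join(dir, 'silent.rb')
  File.open(throwing, 'w') { |f| f.puts 'throw :value, "hello"' }
  File.open(silent, 'w')   { |f| f.puts '# this file does not throw' }

  p load_value(throwing)           # prints "hello"
  p load_value(silent, :nothing)   # prints :nothing
end
```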