Jul 15, 2008

SOAP gets in your eyes


I have a few days between finishing one contract and starting the next. I've decided to use that time to learn a bit of Ruby and the Rails framework for a small project. I'm putting something together to do time tracking and communicate with the Professional Services Automation software that we use in Verilab. As ever, this sort of learning only really happens on an 'as-needed' basis, so I think that a small driver project will move things along.

One of the first things I've been working on is the underlying communication with the web services interface that Projector provides into their database. They use a SOAP interface, with a Web Services Description Language (WSDL) representation of the API. This WSDL file is a machine-readable XML description of all of the API calls and the expected types for those calls. You can interact with the SOAP interface directly, constructing the XML for each request and then parsing the responses manually. However, that becomes painful very quickly, as the calls are verbose and unwieldy. The solution is to use one of the various SOAP frameworks available, which interrogate the WSDL and then generate objects and methods to encapsulate the interface.
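To give a feel for the do-it-by-hand route, here is a rough sketch using REXML, which ships with Ruby. It is not the Projector API: the service namespace and the helper names (build_envelope, extract_texts) are made up for illustration, and a real request would also need the authentication header and the typed request body.

```ruby
require 'rexml/document'

SOAP_ENV_NS = 'http://schemas.xmlsoap.org/soap/envelope/'
# Hypothetical service namespace, for illustration only.
SERVICE_NS = 'http://example.com/hypothetical/service'

# Build a SOAP 1.1 envelope whose body holds one empty operation element.
def build_envelope(operation)
  doc = REXML::Document.new
  doc << REXML::XMLDecl.new('1.0', 'utf-8')
  envelope = doc.add_element('env:Envelope', 'xmlns:env' => SOAP_ENV_NS)
  body = envelope.add_element('env:Body')
  body.add_element("n1:#{operation}", 'xmlns:n1' => SERVICE_NS)
  doc.to_s
end

# Collect the text of every element in a response with the given local name,
# ignoring whatever namespace prefix the server chose.
def extract_texts(response_xml, local_name)
  doc = REXML::Document.new(response_xml)
  REXML::XPath.match(doc, "//*[local-name()='#{local_name}']").map(&:text)
end

puts build_envelope('ExportProjectList')
```

Even this toy version needs two namespaces and three levels of nesting before any actual data appears, which is why handing the job to a framework is attractive.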

This all seemed mostly reasonable, and I got a copy of soap4r, which is the default Ruby SOAP interface. The latest version supports two approaches: dynamically parsing the WSDL and generating object factories, or running a script that statically parses the WSDL and generates a variety of helper classes that can be used to build the SOAP calls.

At this point, the almost total lack of documentation for soap4r started to bite me. There is plenty of sample code, assuming you only ever want to pass a string (like a stock ticker) and only ever really expect a single integer or float to come back (such as a stock price). Very few examples go much further than that, but the Projector SOAP API uses a variety of heavily nested complexTypes and it wasn't very clear at all how to access or manipulate them. I made the initial mistake of trying to use the dynamic WSDL parsing, but after a while switched to using the statically generated classes which helped somewhat. At least then I could read the source and see what the member variables were in the classes and also what the class names were. Part of the problem seems to be that the Ruby world prefers the RESTful approach to web services, so SOAP is something of an ugly step-child. But SOAP is what I have to work with to get the information I need.

At one point, I abandoned the Ruby version and tried to build an equivalent set of queries in Python. I'm more familiar with that language and thought it might remove one of the levels of complexity from the problem. In the Python world, I tried using the SUDS framework to manage the WSDL. In this case, SUDS only supports dynamic parsing of the WSDL file and this parsing is quite a computationally expensive task. It doesn't make for fast, iterative exploration when it takes about 30 seconds to start up the script each time. I wasn't able to pickle the results to cache the driver either. Again, the SUDS framework has a real dearth of documentation - in fact it is even more sparse than soap4r. However, poking around at the classes using the introspection features of Python helped me get a bit further along and also cast the Ruby experience in a different light. I was able to take what I'd learned in Python and apply it to the Ruby scripts and made quite a bit more progress.

I've been learning bits and pieces of Ruby along the way, too. Ruby is also a dynamic language with introspection, so I was able to start poking around in the objects, printing out methods and instance_variables to see what was going on. The interactive command line in Python is fantastic for doing this sort of exploration. I haven't yet found an equivalently powerful interactive command line for this sort of playing around in Ruby (feel free to let me know how!). By trial and error, and by dumping objects along the way, I was able to get the data I wanted.
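As a rough sketch of that kind of poking around: the ProjectStub class below is a made-up stand-in for one of the generated classes (the real ones come out of wsdl2ruby.rb and have different members), and describe is just a hypothetical helper name.

```ruby
# Hypothetical stand-in for a class generated from the WSDL.
class ProjectStub
  attr_accessor :name, :code

  def initialize(name, code)
    @name = name
    @code = code
  end
end

# Summarise an object: its class, the instance variables it currently holds,
# and the methods defined directly on its class (inherited ones excluded).
def describe(obj)
  {
    class: obj.class.name,
    instance_variables: obj.instance_variables,
    own_methods: obj.class.instance_methods(false).sort
  }
end

p describe(ProjectStub.new('Widget redesign', 'WR-01'))
```

Passing false to instance_methods keeps the output down to what the class itself defines, which is usually what you want when trying to work out the shape of an unfamiliar object.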

So a day and a half later, I now have a simple Ruby script that can talk to the ProjectorPSA SOAP API and query the list of active projects, then print out and count that list. Painful to get up the learning curve, but now at least I know how to work with the SOAP framework and make the method calls that I need. The equivalent Python script is almost there, but with a missing namespace in the generated XML that I haven't quite worked out how to control from the other side of the SUDS framework.

Overall, SOAP still seems very verbose and complex, for what it does - layers of objects, lots of XML, just to do very simple queries. A simple method invocation such as:

<?xml version="1.0" encoding="utf-8" ?>
<env:Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      <n2:request xsi:type="n2:ExportProjectListRq">

The Ruby code to generate this one remote procedure call is equally verbose, even with all the auto generated code within the soap4r framework:

require 'rubygems'
gem 'soap4r'
require 'soap/wsdlDriver'
require 'soap/header/simplehandler'
require 'highline/import'  # provides ask(), used for the password prompt below
require 'defaultDriver'

# new authentication class to construct proper SOAP Authentication header for
# each access to the server
# this is idiomatic for the soap4r framework - it is what it is
class ClientAuthHeaderHandler < SOAP::Header::SimpleHandler
  def initialize(userid, passwd)
    super(XSD::QName.new("http://www.opsplanning.com/webservices/public/data", "OpsAuthenticationHeader"))
    @sessionid = nil
    @userid = userid
    @passwd = passwd
  end

  # send the session id once we have one, otherwise the full credentials
  def on_simple_outbound
    if @sessionid
      { "sessionid" => @sessionid }
    else
      { "AccountName" => "verilab", "EmailAddress" => @userid, "Password" => @passwd }
    end
  end

  # remember the session id that the server hands back
  def on_simple_inbound(my_header, mustunderstand)
    @sessionid = my_header["sessionid"]
  end
end

# make sure everything is unicode-friendly, just in case
XSD::Charset.encoding = 'UTF8'

# create the SOAP driver object to handle the requests
endpoint_url = ARGV.shift
driver = OpsProjectorSvcSoap.new(endpoint_url)

# enable debug output (showing SOAP XML) if you run this script with ruby -d
driver.wiredump_dev = STDOUT if $DEBUG

# set up authentication object
user = "account name here"

# prompt for the password each time the script runs (needs the highline gem)
passwd = ask("Password:") { |q| q.echo = false }

# create the authentication token and stuff it into the driver's
# header for every SOAP request that gets generated
auth = ClientAuthHeaderHandler.new user, passwd
driver.headerhandler << auth

# Wrap the request in a Rq object, inside an ExportProjectList object
# means it all unrolls to be the correct SOAP/XML. There may be a more direct
# way to do this from just the ExportProjectList and property setting?
# :LimitToOpenForTimeOnly => true ???
req = ExportProjectList.new(
  ExportProjectListRq.new(
    ExportProjectListRequest.new(true, true, nil, 2000000, nil, false)
  )
)

# make the SOAP call, and extract the exportProjectListResult object
result = driver.exportProjectList(req).exportProjectListResult

# display project list. The hierarchy can be intuited from the various bits of ruby
# generated by the wsdl2ruby.rb script (defaultMappingRegistry.rb, default.rb et al)
# (print_project is a helper that is not shown in this listing)
result.data.projectList.project.each { |project| print_project(project) }
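The listing above calls print_project without showing it. Something along the following lines would do the job of printing and counting the list. This is only a sketch: the Project struct and its name and code fields are invented stand-ins, since the real member names come from the classes that wsdl2ruby.rb generates.

```ruby
# Hypothetical stand-in for the project objects returned by the SOAP call;
# the real field names come from the generated classes and may differ.
Project = Struct.new(:name, :code)

# Format one project as a single line of text.
def format_project(project)
  "#{project.code}: #{project.name}"
end

# Print one project per line.
def print_project(project)
  puts format_project(project)
end

projects = [
  Project.new('Widget redesign', 'WR-01'),
  Project.new('Chip bring-up', 'CB-02')
]
projects.each { |project| print_project(project) }
puts "#{projects.size} projects"
```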

The analogous Python code is similarly wordy. As our very bright admin, Will, says about SOAP: 'run away, run away'.
