
UDDI to GILS Gateway
(under development by Matthew Dovey, et al, Oxford University) 

GILS Technical Topic page on UDDI

UDDI Request
find_business

UDDI/GILS Gateway

UDDI Response
businessList

Gateway Functions

1. Search the public UDDI database, giving the public businessList
2. Search GILS sources, giving the GILS businessList
3. Merge and de-dup ("OR") the public businessList with the GILS businessList, giving the UDDI businessList

1. Search public UDDI database

1.1 Send UDDI request, e.g.,

<find_business maxRows="100">
  <name>Utah</name>
</find_business>

1.2 Hold UDDI response as "public businessList", e.g.,

<businessInfo businessKey="80A15BE8... A4F5">
   <name>BIBLIOGRAPHY OF UTAH GEOLOGY</name>


2. Search GILS sources

2.0 Note: A "zurl database" is maintained in the Gateway. It correlates a public UDDI businessKey uuid (universally unique identifier) with the identifier, here referred to as a "zurl", of a businessEntity record retrievable through the other Gateway sources (see zurl note below).

2.1 Pass find_business UDDI request to other gateway sources

2.1.1 Records are to be retrieved from GILS sources up to the limit given by the maxRows attribute of the find_business request (see query note below)

2.2 Identify retrieved records against zurl database and public UDDI register

2.2.1 If businessEntity retrieved from GILS does not exist in public register, register it and identify the record by its businessKey uuid in the zurl database

2.2.2 If businessEntity is already in the public register but is not equivalent (see comparison note below), update it

2.3 Hold retrieved and identified records as "GILS businessList"
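
Steps 2.2 and 2.3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `PublicRegister` class, the `reconcile` function, the dict-based zurl database, and the `equivalent` callback are all hypothetical names invented here, and the register is an in-memory stand-in rather than a real UDDI publishing API.

```python
class PublicRegister:
    """Hypothetical in-memory stand-in for the public UDDI register."""

    def __init__(self):
        self.entries = {}   # businessKey -> record
        self._counter = 0

    def save(self, record, business_key=None):
        # With no key, register a new businessEntity and invent a key;
        # with a key, overwrite (update) the existing entry.
        if business_key is None:
            self._counter += 1
            business_key = f"uuid-{self._counter}"
        self.entries[business_key] = record
        return business_key


def reconcile(gils_records, zurl_db, register, equivalent):
    """Identify each retrieved GILS record by its businessKey.

    gils_records: list of (zurl, record) pairs retrieved from GILS sources.
    zurl_db:      dict mapping zurl -> businessKey uuid (assumed layout).
    equivalent:   callback deciding whether two records match.
    """
    gils_business_list = []
    for zurl, record in gils_records:
        key = zurl_db.get(zurl)
        if key is None:
            # 2.2.1: not yet in the public register, so register it
            key = register.save(record)
            zurl_db[zurl] = key
        elif not equivalent(register.entries.get(key), record):
            # 2.2.2: registered but not equivalent, so update it
            register.save(record, key)
        gils_business_list.append((key, record))
    # 2.3: the identified records form the "GILS businessList"
    return gils_business_list
```

The `equivalent` callback is left open on purpose, since the comparison itself is one of the outstanding issues noted below.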


3. Merge public with GILS and de-dup

3.1 Merge the public and GILS businessLists, using scores or other criteria (see ranking note below).

3.2 De-dup records in merged businessList using businessKey

3.3 Pass up to maxRows records from merged businessList as UDDI response businessList
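
A minimal sketch of steps 3.1 to 3.3, assuming each businessList is held as a list of (businessKey, record) pairs. The function name `merge_business_lists` is hypothetical, and plain concatenation stands in for whatever ranking is eventually chosen. When a businessKey appears in both lists, this sketch keeps the first copy seen, which is itself an assumption.

```python
def merge_business_lists(public_list, gils_list, max_rows):
    """Merge two lists of (businessKey, record) pairs.

    Returns (rows, truncated): rows is de-duplicated on businessKey and
    capped at max_rows; truncated says whether any records were dropped
    by the cap.
    """
    merged = []
    seen = set()
    # 3.1: merge (here: simple concatenation, public records first)
    for key, record in list(public_list) + list(gils_list):
        # 3.2: de-dup on businessKey, keeping the first occurrence
        if key in seen:
            continue
        seen.add(key)
        merged.append((key, record))
    # 3.3: pass up to max_rows records
    truncated = len(merged) > max_rows
    return merged[:max_rows], truncated
```

The `truncated` flag is intended to correspond to the truncated attribute of the businessList response.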


Example: find_business

<?xml version="1.0" encoding="UTF-8" ?>
<Envelope xmlns=".../soap/envelope/">
  <Body>
    <find_business generic="1.0" maxRows="100">
      <name>Utah</name>
    </find_business>
  </Body>
</Envelope>

Example: businessList

<?xml version="1.0" encoding="UTF-8" ?>
<Envelope xmlns=".../soap/envelope/">
  <Body>
    <businessList generic="1.0" truncated="false">
      <businessInfos>
        <businessInfo businessKey="163.1.91.181-5e9756:e4a4de6208:-7fb8">
          <name>BIBLIOGRAPHY OF UTAH GEOLOGY</name>
          <description>Keyworded compilation of 11,300 bibliographic entries</description>
          <serviceInfos>
            <serviceInfo businessKey="80A15BE8-1C18-47B4-8A4D-5A047277A4F5"
                         serviceKey="6C80C032-AB12-4A72-B27B-2E03DF285818">
              <name>Z39.50 Service</name>
            </serviceInfo>
          </serviceInfos>
        </businessInfo>
      </businessInfos>
    </businessList>
  </Body>
</Envelope>

Narrative (from Matthew's e-mail note)

I'm assuming that there is a GILS-defined persistent DocId or zurl which we can use to identify and retrieve any given GILS record - we can sort out the details of this later. For now I'll just refer to this as DocId. The gateway has a local database of UDDI uuids and their corresponding GILS DocIds.

On receiving a find..., the gateway performs a search on the GILS server(s) and a UDDI server (e.g. Microsoft). For each GILS record, the gateway checks whether there is a corresponding entry in the database for the DocId.

If there is, the record for the corresponding uuid is pulled from the UDDI server and compared with the GILS record (converted to UDDI). If there are changes, an update is posted to the UDDI server.

If there isn't, an add is posted to the UDDI server and the corresponding uuid is stored against the DocId in the local database.

The GILS records (converted to UDDI and with uuid from the local database) are then merged with the UDDI records from the UDDI server (de-duping on uuid) and the full set returned (this is necessary to meet the requirement that searching any UDDI server should produce the same result as searching any other).


Notes on Outstanding Issues

comparison
A comparison is needed to determine when a public UDDI businessDetail record matches what is retrieved from the GILS source and converted to a UDDI businessDetail record. Both records will have been cast as XML DOM (Document Object Model). Equivalence should ignore white space, UDDI keys, and order of elements where the parent is a "bag".
One suggestion is to use a hash function for comparison. The hash function would have to be commutative over a bag, so that reordering the members of a bag does not change the hash. Another suggestion is that two DOM nodes might be regarded as equivalent if they have identical child nodes (count and names), and if the length of the text value at each child node is equal between the two nodes. This second suggestion ignores values in attributes and will miss changes in which text has been replaced with other text of exactly the same length.
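
As a rough sketch of the hash suggestion, the Python below reduces a record to a canonical form and hashes that. It achieves order-independence by sorting the children of a bag before hashing, which is a substitute for a truly commutative hash function. The names `canonical` and `digest` are hypothetical, and the sets `KEY_ATTRS` and `BAGS` are assumptions about which attributes count as UDDI keys and which parents count as bags.

```python
import hashlib
import xml.etree.ElementTree as ET

# Assumptions: attributes treated as registry-assigned keys, and
# parent elements treated as unordered "bags".
KEY_ATTRS = {"businessKey", "serviceKey", "bindingKey"}
BAGS = {"categoryBag", "identifierBag"}


def canonical(elem):
    """Nested tuple that ignores white space, key attributes, and bag order."""
    attrs = tuple(sorted(
        (name, value) for name, value in elem.attrib.items()
        if name not in KEY_ATTRS
    ))
    # Collapse runs of white space; text after child elements is ignored.
    text = " ".join((elem.text or "").split())
    children = [canonical(child) for child in elem]
    if elem.tag in BAGS:
        children.sort()  # order of elements in a bag is not significant
    return (elem.tag, attrs, text, tuple(children))


def digest(xml_text):
    """SHA-256 digest of the canonical form of an XML record."""
    form = repr(canonical(ET.fromstring(xml_text)))
    return hashlib.sha256(form.encode("utf-8")).hexdigest()
```

Two records would then be regarded as equivalent when their digests match. Because the text values themselves are hashed, a replacement with text of the same length is detected, and attribute values other than the ignored keys are taken into account.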
query
UDDI is currently designed from the perspective of "Data Query" rather than "Information Retrieval". Data Query is less effective when the database is heavily populated or typical queries are "fuzzy" (e.g., full recall is less important than relevance ranking). UDDI may also have performance problems when large results are instantiated regardless of whether the searcher is likely to request all of the records. 
UDDI has the ability in a request to specify the maximum rows and in a response to specify whether the result has been truncated. It does not have the notion of a cursor to position within a table, nor the ability to handle named result sets.
ranking
These questions are not unique to this gateway but occur whenever search is required to combine results from sources that do not have precisely equivalent ranking schemes and comparable records.
In the merging of multiple businessLists, there is an issue of how to rank the records of each. Say, for instance, the find_business request specifies a maximum of 100 rows (records), the public businessList contains 75 matching records, and the GILS businessList contains 75 matching records. The GILS businessList includes a relevance score for each record, but the public businessList records have no scores. If the relevance scores are to be considered in the ranking, should a score be forced for the public records? If so, should the forcing be contrived to favor one list or the other, or to spread the selection? If not, on what other basis should records be drawn from the two lists?
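
To make the "spread the selection" option concrete, here is one possible sketch: alternate between the two lists, taking public records in the order received and GILS records by descending score. This is only an illustration of one answer to the questions above, not a recommendation. The function name `interleave` and the `score` field on GILS records are assumptions.

```python
from itertools import zip_longest


def interleave(public_list, gils_list, max_rows):
    """Alternate public and GILS records, then cap at max_rows.

    public_list: records without scores, kept in the order received.
    gils_list:   dicts with a numeric "score" field (assumed layout).
    """
    gils_sorted = sorted(gils_list, key=lambda r: r["score"], reverse=True)
    merged = []
    for public_rec, gils_rec in zip_longest(public_list, gils_sorted):
        for rec in (public_rec, gils_rec):
            if rec is not None:  # one list may run out before the other
                merged.append(rec)
    return merged[:max_rows]
```

With the figures in the example above (75 public records, 75 GILS records, a maximum of 100 rows), this draws 50 records from each list.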
zurl
If the retrieval zurl (i.e., "z3950://host:port/dbname?docId") were both globally unique and persistent, a simple cross-reference between UDDI uuid and retrieval zurl would suffice. But a zurl with a docId is often no more persistent than the interval between re-indexings of a database. It would also be difficult to distinguish duplicates, because the same business can be found through various Z39.50 servers.
Persistence is a thorny problem, and UDDI may not have a complete solution yet either. For the time being, we can pretend that a retrieval zurl is as globally unique and persistent as the UDDI uuid. Periodically, we can enforce alignment between the two identifier spaces by retrieving everything registered by the UDDI/GILS Gateway and running a batch comparison of retrieval zurls to UDDI records. For any broken DocId, the batch program would send a delete_business message to the public UDDI register.
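
The batch comparison could look something like the sketch below. It assumes the zurl database is a dict from zurl to businessKey and that a `retrieve` callable returns None when a zurl no longer resolves. Both are hypothetical, as is the function name `find_broken_zurls`. Sending the actual delete_business message is left out.

```python
def find_broken_zurls(zurl_db, retrieve):
    """Return (zurl, businessKey) pairs whose zurl no longer retrieves a record.

    zurl_db:  dict mapping zurl -> businessKey uuid (assumed layout).
    retrieve: callable taking a zurl and returning the record, or None
              if nothing can be retrieved (assumed behaviour).
    """
    broken = []
    for zurl, business_key in zurl_db.items():
        if retrieve(zurl) is None:
            broken.append((zurl, business_key))
    return broken
```

Each businessKey returned would then be the subject of a delete_business message, and the corresponding entry would be dropped from the zurl database.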

Test Site

(Use with UDDI client such as Microsoft's apiExplore) http://163.1.91.181:8080/uddi/servlet/registry


Presentations

UDDI by Chris Kurt, UDDI.ORG Program Manager

GILS and UDDI by Eliot Christian

