(Not)XQuery

This is a discussion on (Not)XQuery within the lisp forums in Programming Languages category; I have developed some simple tools for working with hierarchical data that basically operate on nested association lists. For example, rs-ref is function that operates on a record-set*, returns a list of record sets, and accepts :key and :test arguments for filtering. A sample query might look like: (rs-ref (as-record-set (root (record (sub-type1 (tagged-val 1)) (sub-type2 (tagged-val 2))) (record (sub-type1 (tagged-val 3)) (sub-type2 (tagged-val 4))))) '(record sub-type1) :key (lambda (rs) (rs-member rs 'tagged-val)) :test (lambda (val) (> val 2))) => (#<[TREE-NODE| :TYPE SUB-TYPE1 :RECORDS (#<[TREE-LEAF|:TYPE TAGGED-VAL :VALUE 3]>)]>) The implementation is not terribly optimized. Usually I read some semi- structured ...

Go Back   Application Development Forum > Programming Languages > lisp

Object Mix

Register FAQ Calendar Search Today's Posts Mark Forums Read
  #1  
Old 11-07-2008, 11:30 AM
Matthew D Swank
Guest
 
Default (Not)XQuery

I have developed some simple tools for working with hierarchical data
that basically operate on nested association lists.

For example, rs-ref is function that operates on a record-set*, returns a
list of record sets, and accepts :key and :test arguments for filtering.

A sample query might look like:
(rs-ref
(as-record-set (root
(record (sub-type1 (tagged-val 1))
(sub-type2 (tagged-val 2)))
(record (sub-type1 (tagged-val 3))
(sub-type2 (tagged-val 4)))))
'(record sub-type1)
:key (lambda (rs) (rs-member rs 'tagged-val))
:test (lambda (val) (> val 2)))

=>
(#<[TREE-NODE|
:TYPE SUB-TYPE1 :RECORDS (#<[TREE-LEAF|:TYPE TAGGED-VAL :VALUE 3]>)]>)

The implementation is not terribly optimized. Usually I read some semi-
structured file into a record-set, then write out some appropriately
massaged query data. It is usually I/O bound, and if I'm consing a lot I
can provide a context for memoization:

(with-rs-cache
<lots of queries and output>)

However, I am starting to do more ad hoc joins on data from different
sources, and it would be nice if I could make this more general and
efficient.

Any ideas?

Matt

*A record-set is anything that implements the record-set protocol, but
currently I have implemented only one set of concrete types.
Reply With Quote
  #2  
Old 11-08-2008, 01:48 PM
Madhu
Guest
 
Default Re: (Not)XQuery


* Matthew D Swank <43_Qk.5154$J21.2343@newsfe01.iad> :
Wrote on Fri, 07 Nov 2008 16:30:56 GMT:

| I have developed some simple tools for working with hierarchical data
| that basically operate on nested association lists.
|
| For example, rs-ref is function that operates on a record-set*, returns a
| list of record sets, and accepts :key and :test arguments for filtering.
|
| A sample query might look like:
| (rs-ref
| (as-record-set (root
| (record (sub-type1 (tagged-val 1))
| (sub-type2 (tagged-val 2)))
| (record (sub-type1 (tagged-val 3))
| (sub-type2 (tagged-val 4)))))
| '(record sub-type1)
| :key (lambda (rs) (rs-member rs 'tagged-val))
| :test (lambda (val) (> val 2)))
|
| =>
| (#<[TREE-NODE|
| :TYPE SUB-TYPE1 :RECORDS (#<[TREE-LEAF|:TYPE TAGGED-VAL :VALUE 3]>)]>)
|
| The implementation is not terribly optimized. Usually I read some semi-
| structured file into a record-set, then write out some appropriately
| massaged query data. It is usually I/O bound, and if I'm consing a lot I
| can provide a context for memoization:
|
| However, I am starting to do more ad hoc joins on data from different
| sources, and it would be nice if I could make this more general and
| efficient.
|
| Any ideas?

Perhaps an existing optimized pattern matcher would work. Here's how the
above example might look in Sunil Mishra's Meta-Circular Pattern
Matcher, MCPat <URL:http://www.sfmishras.com/smishra/mcpat/>

;; Without using :constraints matching,

(defpackage "MCPAT-USER"
(:use "CL" "MCPAT"))
(in-package "MCPAT-USER")

(defvar *sexp* '(root
(record
(sub-type1 (tagged-val 1))
(sub-type2 (tagged-val 2)))
(record
(sub-type1 (tagged-val 3))
(sub-type2 (tagged-val 4)))))


(in-ruleset :MSWANK-RS encoded-ruleset)
(clear-ruleset :MSWANK-RS)

(defrule root-matcher-42
((root . ?records))
`(:tree-node :type sub-type1 :records ,(mapcan 'solve ?records)))

(defrule record-matcher-42
((record . ?subtypes))
(mapcan (lambda (subtype) (solve `(subtype ,subtype))) ?subtypes)))

(defrule subtype-matcher-42
((subtype (?subtype (tagged-val ?val))))
(when (and (eq ?subtype 'sub-type1)
(> ?val 2))
`((:tree-leaf tagged-val ,?val))))

(solve *sexp*)

=> (:TREE-NODE :TYPE SUB-TYPE1 :RECORDS ((:TREE-LEAF TAGGED-VAL 3)))

--
Madhu
Reply With Quote
  #3  
Old 11-08-2008, 07:16 PM
Matthew D Swank
Guest
 
Default Re: (Not)XQuery

On Sun, 09 Nov 2008 00:18:39 +0530, Madhu wrote:

> * Matthew D Swank <43_Qk.5154$J21.2343@newsfe01.iad> : Wrote on Fri, 07
> Nov 2008 16:30:56 GMT:
>
> | I have developed some simple tools for working with hierarchical data
> | that basically operate on nested association lists.

....
> | However, I am starting to do more ad hoc joins on data from different
> | sources, and it would be nice if I could make this more general and |
> efficient.
> |
> | Any ideas?
>
> Perhaps an existing optimized pattern matcher would work. Here's how the
> above example might look in Sunil Mishra's Meta-Circular Pattern
> Matcher, MCPat <URL:http://www.sfmishras.com/smishra/mcpat/>
>
> ;; Without using :constraints matching,
>
> (defpackage "MCPAT-USER"
> (:use "CL" "MCPAT"))
> (in-package "MCPAT-USER")
>
> (defvar *sexp* '(root
> (record
> (sub-type1 (tagged-val 1))
> (sub-type2 (tagged-val 2)))
> (record
> (sub-type1 (tagged-val 3))
> (sub-type2 (tagged-val 4)))))
>
>
> (in-ruleset :MSWANK-RS encoded-ruleset) (clear-ruleset :MSWANK-RS)
>
> (defrule root-matcher-42
> ((root . ?records))
> `(:tree-node :type sub-type1 :records ,(mapcan 'solve ?records)))
>
> (defrule record-matcher-42
> ((record . ?subtypes))
> (mapcan (lambda (subtype) (solve `(subtype ,subtype))) ?subtypes)))
>
> (defrule subtype-matcher-42
> ((subtype (?subtype (tagged-val ?val))))
> (when (and (eq ?subtype 'sub-type1)
> (> ?val 2))
> `((:tree-leaf tagged-val ,?val))))
>
> (solve *sexp*)
>
> => (:TREE-NODE :TYPE SUB-TYPE1 :RECORDS ((:TREE-LEAF TAGGED-VAL 3)))


Interesting, if a bit verbose. It's not immediately evident to me how
solve based one more than one tree. I suppose we're getting in to logic-
programming/db-query-engine land.

If look at it as a representation problem, there are things I like about
nested lists.
1. functional construction
2. preservation of order
3. preservation of multiplicity

However, naive construction of result sets requires lots of intermediate
consing, and joins require linear search.

Using the complete cached representation as _the_ representation would
speed queries up, but not joins. And it would be very expensive (I
suppose this is like the power set?)
Ex.
(root
(record (sub-type1 (tagged-val 1))
(sub-type2 (tagged-val 2)))
(record (sub-type1 (tagged-val 3))
(sub-type2 (tagged-val 4))))

becomes (as a map):

{ nil,
((root
(record (sub-type1 (tagged-val 1))
(sub-type2 (tagged-val 2)))
(record (sub-type1 (tagged-val 3))
(sub-type2 (tagged-val 4)))))

(record),
((record (sub-type1 (tagged-val 1))
(sub-type2 (tagged-val 2)))
(record (sub-type1 (tagged-val 3))
(sub-type2 (tagged-val 4))))

(record sub-type1),
((sub-type1 (tagged-val 1))
(sub-type1 (tagged-val 3)))

.... etc

}

I know a lot of this is obvious, but I am thinking out loud and it's
being caught by a net.

I want to be fast lazy, and functional; is that so hard?

Matt

--
question = ( to ) ? be : ! be;
-- Wm. Shakespeare
Reply With Quote
Reply


Thread Tools
Display Modes


All times are GMT -5. The time now is 11:46 PM.


Powered by vBulletin® Version 3.7.2
Copyright ©2000 - 2009, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
vB Ad Management by =RedTyger=

In an effort to better serve ads to our visitors, cookies are used on objectmix.com. For more information, check out our Privacy Policy.