Extending STRING Class

This is a discussion on Extending STRING Class within the lisp forums in Programming Languages category; Hi, I want to extend STRING class to be able to store all SUBSEQ combinations of the related string. For this purpose, I created a class like below. (defclass string-with-cached-subseqs (string) ((subseqs :initarg :subseqs :accessor :subseqs-of :type '(simple-array string (* *)) :documentation "2D array of STRING's SUBSEQS where array subscripts are respectively START and END of the related SUBSEQ output."))) For me, suprisingly SBCL complained that The class #<BUILT-IN-CLASS STRING> was specified as a super-class of the class #<STANDARD-CLASS STRING-WITH-CACHED-SUBSEQS>, but the meta-classes #<STANDARD-CLASS BUILT-IN-CLASS> and #<STANDARD-CLASS STANDARD-CLASS> are incompatible. Define a method for SB-MOP:VALIDATE-SUPERCLASS to avoid this error. Error ...

Go Back   Application Development Forum > Programming Languages > lisp

Object Mix

Register FAQ Calendar Search Today's Posts Mark Forums Read
  #1  
Old 08-21-2008, 03:36 AM
Volkan YAZICI
Guest
 
Default Extending STRING Class

Hi,

I want to extend STRING class to be able to store all SUBSEQ
combinations of the related string. For this purpose, I created a
class like below.

(defclass string-with-cached-subseqs (string)
((subseqs
:initarg :subseqs
:accessor :subseqs-of
:type '(simple-array string (* *))
:documentation "2D array of STRING's SUBSEQS where array
subscripts are respectively START and END of the related SUBSEQ
output.")))

For me, suprisingly SBCL complained that

The class #<BUILT-IN-CLASS STRING> was specified as
a
super-class of the class
#<STANDARD-CLASS STRING-WITH-CACHED-SUBSEQS>, but
the
meta-classes #<STANDARD-CLASS BUILT-IN-CLASS>
and
#<STANDARD-CLASS STANDARD-CLASS> are incompatible. Define
a
method for SB-MOP:VALIDATE-SUPERCLASS to avoid this error.

Error message explicitly explains the way to fix the problem. But if
I'd use SB-MOP:VALIDATE-SUPERCLASS it'll break to compatibility of my
code for other lisp implementations. Here are my questions:

1. Is there any MOP package I can use to implement VALIDATE-SUPERCLASS
portably.

2. Am I on the wrong way by trying to extend the STRING class. I
thought it would make things easier by being able to benefit from
STRING specialized methods. How should I impelement what I want to
achieve?


Regards.
Reply With Quote
  #2  
Old 08-21-2008, 03:59 AM
Rainer Joswig
Guest
 
Default Re: Extending STRING Class

In article
<f9e9f800-f928-46f9-b2ea-cec486df6e79@c65g2000hsa.googlegroups.com>,
Volkan YAZICI <volkan.yazici@gmail.com> wrote:

> Hi,
>
> I want to extend STRING class to be able to store all SUBSEQ
> combinations of the related string. For this purpose, I created a
> class like below.
>
> (defclass string-with-cached-subseqs (string)
> ((subseqs
> :initarg :subseqs
> :accessor :subseqs-of
> :type '(simple-array string (* *))
> :documentation "2D array of STRING's SUBSEQS where array
> subscripts are respectively START and END of the related SUBSEQ
> output.")))


This will not work. You can't extend STRING that way.

CL-USER 11 > (find-class 'string)
#<BUILT-IN-CLASS STRING 40F0334C83>

STRING is a built-in-class and not a CLOS standard class.
You can't define subclasses of string. The class hierarchy
for built-in classes is fixed and not extensible for the user.

Sometimes you wish to be able to do that. But ANSI CL
does not have that.

Compare that with the Dylan idea of 'Collections':

http://lispm.dyndns.org/documentatio...ated/ch12.html

....


> The class #<BUILT-IN-CLASS STRING> was specified as
> a
> super-class of the class
> #<STANDARD-CLASS STRING-WITH-CACHED-SUBSEQS>, but
> the
> meta-classes #<STANDARD-CLASS BUILT-IN-CLASS>
> and
> #<STANDARD-CLASS STANDARD-CLASS> are incompatible. Define
> a
> method for SB-MOP:VALIDATE-SUPERCLASS to avoid this error.
>
> Error message explicitly explains the way to fix the problem. But if
> I'd use SB-MOP:VALIDATE-SUPERCLASS it'll break to compatibility of my
> code for other lisp implementations. Here are my questions:
>
> 1. Is there any MOP package I can use to implement VALIDATE-SUPERCLASS
> portably.
>
> 2. Am I on the wrong way by trying to extend the STRING class. I
> thought it would make things easier by being able to benefit from
> STRING specialized methods. How should I impelement what I want to
> achieve?



You can write STRING specialized methods:

(defmethod foo ((s string))
(do-something s))

But you can't redefine existing functions like LENGTH
and you can't subclass built-in classes.
LENGTH is a normal function and you are not allowed
(in portable code) to replace it with a generic function.

The typical way to define for example your own version
of length, is to do it in a package:


; package MY-EXTENDED-COMMON-LISP. MECL. Has its own symbol LENGTH
; exported...

(defmethod mecl:length ((s string)
(cl:length s))

and so on. But that's for experts and quite a bit work
to get things right.

Possibly there is already something in the direction - haven't looked
for that yet.


>
>
> Regards.


--
http://lispm.dyndns.org/
Reply With Quote
  #3  
Old 08-21-2008, 05:04 AM
Tobias C. Rittweiler
Guest
 
Default Re: Extending STRING Class

Rainer Joswig <joswig@lisp.de> writes:

> STRING is a built-in-class and not a CLOS standard class.
> You can't define subclasses of string. The class hierarchy
> for built-in classes is fixed and not extensible for the user.
>
> Sometimes you wish to be able to do that. But ANSI CL
> does not have that.
>
> Compare that with the Dylan idea of 'Collections':
>
> http://lispm.dyndns.org/documentatio...ated/ch12.html


While not ANSI CL, SBCL supports extensible sequences.

The OP should take a look at

http://www.doc.gold.ac.uk/~mas01cr/p...s-20070301.pdf

-T.
Reply With Quote
  #4  
Old 08-21-2008, 05:21 AM
Rainer Joswig
Guest
 
Default Re: Extending STRING Class

In article <87wsia20xw.fsf@freebits.de>,
"Tobias C. Rittweiler" <tcr@freebits.de.invalid> wrote:

> Rainer Joswig <joswig@lisp.de> writes:
>
> > STRING is a built-in-class and not a CLOS standard class.
> > You can't define subclasses of string. The class hierarchy
> > for built-in classes is fixed and not extensible for the user.
> >
> > Sometimes you wish to be able to do that. But ANSI CL
> > does not have that.
> >
> > Compare that with the Dylan idea of 'Collections':
> >
> > http://lispm.dyndns.org/documentatio...ated/ch12.html

>
> While not ANSI CL, SBCL supports extensible sequences.
>
> The OP should take a look at
>
> http://www.doc.gold.ac.uk/~mas01cr/p...s-20070301.pdf
>
> -T.


Right, that's an very good paper. It describes how to
extend and subclass sequences. That's certainly very useful
(currently I have a use for that, too).
But it still does not allow one to subclass strings and
extend the various operations on strings
(MAKE-STRING, STRING-TRIM, STRING-UPCASE, STRINGP, STRING=, CHAR, ...).
Also one would want the other operations that use strings
would accept (or create strings using) the new string class.
Much of the base language of ANSI CL is not that extensible.

Still, if there would be ever a revision of ANSI CL, the
above mentioned extensible sequences would be a useful contribution.

--
http://lispm.dyndns.org/
Reply With Quote
  #5  
Old 08-21-2008, 08:00 AM
Pascal J. Bourguignon
Guest
 
Default Re: Extending STRING Class

Volkan YAZICI <volkan.yazici@gmail.com> writes:

> Hi,
>
> I want to extend STRING class to be able to store all SUBSEQ
> combinations of the related string. For this purpose, I created a
> class like below.
>
> (defclass string-with-cached-subseqs (string)
> ((subseqs
> :initarg :subseqs
> :accessor :subseqs-of
> :type '(simple-array string (* *))
> :documentation "2D array of STRING's SUBSEQS where array
> subscripts are respectively START and END of the related SUBSEQ
> output.")))
>
> For me, suprisingly SBCL complained that
>
> The class #<BUILT-IN-CLASS STRING> was specified as


Yes.
Rainer explained why.


> 1. Is there any MOP package I can use to implement VALIDATE-SUPERCLASS
> portably.


CLOSER-MOP Find it on cliki.net

> 2. Am I on the wrong way by trying to extend the STRING class. I
> thought it would make things easier by being able to benefit from
> STRING specialized methods. How should I impelement what I want to
> achieve?


So you want to keep all subseq combinations related to a string?
There's no need to keep them encapsulated with the lisp string itself.
Of course, you can still abstract away that encapsulation.
So you want:

(subseqs-of string) --> 2d-array-of-substrings


To implement that portably, there will be the difficulty that the
string won't be garbage collected as long as we keep a reference to
it, so we'll need a method to "free" it.

(free string)


An alternative would be to use an implementation specific weak
something of some kind (weak hash tables, weak pointers, etc).
(or a compatibility layer such as
http://darcs.informatimago.com/lisp/...oser-weak.lisp )



But let's stay 100% CL portable.

(defvar *subseqs-of-string-map* (make-hash-table))
;; Note since objects have EQL identity, we use an EQL hash table for
;; our strings. That's what you're asking for...


(defmethod subseqs-of ((self string))
(or (gethash self *subseqs-of-string-map*)
(setf (gethash self *subseqs-of-string-map*)
(compute-subseqs-of-string self))))

(defmethod free ((self string))
(remhash self *subseqs-of-string-map*)
self)

(defmethod compute-subseqs-of-string ((self string))
(loop
:with result = (make-array (list (1+ (length string)) (1+ (length string))))
:for start :from 0 :to (length string)
:do (loop :for end :from 0 :to (length string)
:do (setf (aref result start end)
(COM.INFORMATIMAGO.COMMON-LISP.UTILITY:NSUBSEQ string start end)))
:finally (return result)))

Notice the use of NSUBSEQ which builds displaced arrays to avoid
copying the original characters N^2 times...



This is still quite a waste, do you really need to keep all these
substrings? NSUBSEQ is rather efficient, and if you need to scan over
the subseqs often, you could even use only one displaced array that
you would update, having thus only one pointer and one length to
update from one substring to the other.


NSUBSEQ can be found in:
http://darcs.informatimago.com/lisp/...p/utility.lisp

--
__Pascal Bourguignon__
Reply With Quote
  #6  
Old 08-21-2008, 10:19 AM
Volkan YAZICI
Guest
 
Default Re: Extending STRING Class

On Aug 21, 3:00 pm, p...@informatimago.com (Pascal J. Bourguignon)
wrote:
> So you want to keep all subseq combinations related to a string?
> There's no need to keep them encapsulated with the lisp string itself.
> Of course, you can still abstract away that encapsulation.
> So you want:
>
> (subseqs-of string) --> 2d-array-of-substrings


That is exactly what I want to achieve.

> To implement that portably, there will be the difficulty that the
> string won't be garbage collected as long as we keep a reference to
> it, so we'll need a method to "free" it.
>
> (free string)
>
> An alternative would be to use an implementation specific weak
> something of some kind (weak hash tables, weak pointers, etc).
> (or a compatibility layer such as
> http://darcs.informatimago.com/lisp/...oser-weak.lisp)
>
> But let's stay 100% CL portable.


Please.

> (defvar *subseqs-of-string-map* (make-hash-table))
> ;; Note since objects have EQL identity, we use an EQL hash table for
> ;; our strings. That's what you're asking for...
>
> (defmethod subseqs-of ((self string))
> (or (gethash self *subseqs-of-string-map*)
> (setf (gethash self *subseqs-of-string-map*)
> (compute-subseqs-of-string self))))
>
> (defmethod free ((self string))
> (remhash self *subseqs-of-string-map*)
> self)


Hrm... I see. That's fine too. But AFAIK, there is no FREE in ANSI CL.
Am I mistaken? How is FREE supposed to work?

> Notice the use of NSUBSEQ which builds displaced arrays to avoid
> copying the original characters N^2 times...


Yep, that is something I had in mind to use.

> This is still quite a waste, do you really need to keep all these
> substrings?


I'll use pre-computed SUBSEQ values for fast dictionary look-ups.


Regards.
Reply With Quote
  #7  
Old 08-21-2008, 11:29 AM
Rob Warnock
Guest
 
Default Re: Extending STRING Class

Volkan YAZICI <volkan.yazici@gmail.com> wrote:
+---------------
| p...@informatimago.com (Pascal J. Bourguignon) wrote:
| > (defmethod free ((self string))
| > (remhash self *subseqs-of-string-map*)
| > self)
|
| Hrm... I see. That's fine too. But AFAIK, there is no FREE in ANSI CL.
| Am I mistaken? How is FREE supposed to work?
+---------------

All it does it remove the argument from the hash table, thus
removing the reference to the argument *by* the hash table that
would prevent it from being GC'd. [There may still be *other*
references to the object in your code; it's up to you to remove
or overwrite those.]

You are correct that there is no built-in FREE in CL. But
all you have to do in CL to "free" an object is remove *all*
references to the object [turning it into "garbage"], and
the GC will (eventually) come along and discover the lack of
any such references and collect the space used by the object.


-Rob

-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607

Reply With Quote
  #8  
Old 08-21-2008, 03:27 PM
Don Geddis
Guest
 
Default Re: Extending STRING Class

rpw3@rpw3.org (Rob Warnock) wrote on Thu, 21 Aug 2008:
> You are correct that there is no built-in FREE in CL. But all you have to
> do in CL to "free" an object is remove *all* references to the object
> [turning it into "garbage"], and the GC will (eventually) come along and
> discover the lack of any such references and collect the space used by the
> object.


You know, everybody describes it that way. But GCs really work by finding
still-used memory (and copying it elsewhere), not by finding garbage. They
never actually "discover the lack of references". There's not an explicit
step in the algorithm, where it suddenly concludes: "Ah ha! There are no
longer any references to this little block of memory! I'll collect it."

Obviously you know this already, but I always found it ironic that the
standard one-line explanation (and even the name!) of "garbage collection"
isn't really correct.

-- Don
__________________________________________________ _____________________________
Don Geddis http://don.geddis.org/ don@geddis.org
Fun Facts for the Psychotic:
The cuddly "Koala bear" of Australia is actually not a "bear" at all...
....it is a telepathic cyborg that reports your thoughts to "The Council"!
-- Ruben Bolling, "Tom The Dancing Bug", #746
Reply With Quote
  #9  
Old 08-21-2008, 05:26 PM
Robert Swindells
Guest
 
Default Re: Extending STRING Class

On Thu, 21 Aug 2008 12:27:40 -0700, Don Geddis wrote:

> rpw3@rpw3.org (Rob Warnock) wrote on Thu, 21 Aug 2008:
>> You are correct that there is no built-in FREE in CL. But all you have
>> to do in CL to "free" an object is remove *all* references to the
>> object [turning it into "garbage"], and the GC will (eventually) come
>> along and discover the lack of any such references and collect the
>> space used by the object.

>
> You know, everybody describes it that way. But GCs really work by
> finding still-used memory (and copying it elsewhere), not by finding
> garbage. They never actually "discover the lack of references".
> There's not an explicit step in the algorithm, where it suddenly
> concludes: "Ah ha! There are no longer any references to this little
> block of memory! I'll collect it."


Mark-and-sweep really can work in the way Rob Warnock described.

One example is the GC in Franz Lisp
<http://www.cs.cmu.edu/afs/cs/project...i/lang/others/
franzlsp/op38_93b/franzlsp.tgz>

Robert Swindells
Reply With Quote
  #10  
Old 08-21-2008, 05:27 PM
Robert Swindells
Guest
 
Default Re: Extending STRING Class

On Thu, 21 Aug 2008 12:27:40 -0700, Don Geddis wrote:

> rpw3@rpw3.org (Rob Warnock) wrote on Thu, 21 Aug 2008:
>> You are correct that there is no built-in FREE in CL. But all you have
>> to do in CL to "free" an object is remove *all* references to the
>> object [turning it into "garbage"], and the GC will (eventually) come
>> along and discover the lack of any such references and collect the
>> space used by the object.

>
> You know, everybody describes it that way. But GCs really work by
> finding still-used memory (and copying it elsewhere), not by finding
> garbage. They never actually "discover the lack of references".
> There's not an explicit step in the algorithm, where it suddenly
> concludes: "Ah ha! There are no longer any references to this little
> block of memory! I'll collect it."


Mark-and-sweep really can work in the way Rob Warnock described.

One example is the GC in Franz Lisp
<http://www.cs.cmu.edu/afs/cs/project...i/lang/others/
franzlsp/op38_93b/franzlsp.tgz>

Robert Swindells
Reply With Quote
Reply


Thread Tools
Display Modes


All times are GMT -5. The time now is 04:09 PM.


Powered by vBulletin® Version 3.7.2
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
vB Ad Management by =RedTyger=

In an effort to better serve ads to our visitors, cookies are used on objectmix.com. For more information, check out our Privacy Policy.