Pointer Artithmetic on DB_File values: i need it (or a very efficient solution for this problem) - Perl
This is a discussion on Pointer Artithmetic on DB_File values: i need it (or a very efficient solution for this problem) - Perl ; Hope that some perl guru will help me:
I've implemented an inverted index using PERL + DB_File (using hash,
and b-tree). Therefore, i have my indexer and my searcher (a little
search engine, indead).
Given a word in the lexicon ...
-
Pointer Artithmetic on DB_File values: i need it (or a very efficient solution for this problem)
Hope that some perl guru will help me:
I've implemented an inverted index using PERL + DB_File (using hash,
and b-tree). Therefore, i have my indexer and my searcher (a little
search engine, indead).
Given a word in the lexicon i stored in B-tree the corrispondent
inverted list,
using gap coding. For istance i have:
word_r -> id_1r:freq_1r,id_2r:freq_2r,id_3r:freq_3r,........,
id_nr:freq_nr
word_s -> ...
where freqij is just the freq of word_j in document doc_i.
gap encoding means that id_ij is expressed as difference with
id_(i-1)j
Now, i need to write some Cursor obj with state (call it
postingCursor).
The posting in fetched in a scalar and (a reference to it!) is saved
in the Cursor.
I need to write efficiently a function:
$postingCursorOBJ->getNext($minID)
which should return the next available (if any) id_lr >= minID.
1) I need some way to fetch from the disk, just the portion of
inverted list involved in current operations. Do tie or DBI support
this?
2) I need a way to unroll incrementally the inverted list ? Is this a
case
where pointer arithmethic could help ? or there is any efficent
solution
which i missing. I think that splitting it is the wrong way to do it.
PS: the original problem is to find the common IDs, given two inverted
lists
L1 and L2 which store IDs sorted in ascending way. This should be done
in
O(min {n, m)) instead of O(m+n).
-
Re: Pointer Artithmetic on DB_File values: i need it (or a very efficient solution for this problem)
gulli@di.unipi.it (Antonio) wrote in message news:<6c6b5814.0307080048.2bf86f20@posting.google.com>...
>
> PS: the original problem is to find the common IDs, given two inverted
> lists
> L1 and L2 which store IDs sorted in ascending way. This should be done
> in
> O(min {n, m)) instead of O(m+n).
It seems that i found a solution using index and substr (it's my fault
to forget about the importance of index !).
A question: given a scalar, do index/substr access to the specified offset
in O(1)?
Similar Threads
-
By Application Development in forum labview
Replies: 0
Last Post: 10-29-2007, 09:40 PM
-
By Application Development in forum Python
Replies: 1
Last Post: 10-13-2007, 08:21 AM
-
By Application Development in forum c++
Replies: 17
Last Post: 07-10-2007, 05:32 PM
-
By Application Development in forum JDBC JAVA
Replies: 1
Last Post: 03-23-2006, 02:05 AM
-
By Application Development in forum Perl
Replies: 2
Last Post: 08-18-2005, 12:45 PM