Re: TECH: Re: Summary, web based mailinglist archives

From: taliesin the storyteller <taliesin@...>
Date: Monday, October 25, 1999, 13:40
* Paul.Bennett@xncorp.com (Paul.Bennett@xncorp.com) [991025 14:29]:
> Yes. I was never suggesting REing a "monolithic bag o' bits" (TM). What
> I feel it needs is a fairly compute- and space-intensive phase when a
> new message (set) is added. Indexes of indexes and all that funky stuff
> seems to be the order of the day, as well as a cute little trick that I
> call (after the guy who explained it to me) "Julian" Indexes (more on
> this is available for the terminally curious, it's not super techie, but
> if you've never come across it, it blows your mind at first).
Is your 'Julian Indexes' indexing on words, with the words pointing to the documents that contain them? In IR this is called an 'inverted file' (or inverted index), and it is precisely -not- what would suit conlang-l. I've already built a simple IR system ('twas for class) and used one month of conlang mail (May 98?) as the main dataset. No go.

I could write on vector-search and clustering techniques and stemming algorithms too, if you like :)

<warning>I *am* a perfectionist</warning>

tal.
-- 
"Better living through conlanging"
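
For the curious, here is a rough sketch of the word-to-document index asked about above. It is not the IR system mentioned in the message, and it says nothing about how "Julian" Indexes actually work. The function names (build_inverted_index, search), the tokenizer and the three sample messages are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: lowercase runs of letters and apostrophes.
TOKEN = re.compile(r"[a-z']+")


def build_inverted_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in TOKEN.findall(text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every word in the query."""
    words = TOKEN.findall(query.lower())
    if not words:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up sample data, not real list traffic.
docs = {
    "msg1": "Indexes of indexes and all that funky stuff",
    "msg2": "Vector search and clustering techniques",
    "msg3": "Stemming algorithms and vector search",
}
index = build_inverted_index(docs)
```

With this sample data, search(index, "vector search") would return the ids msg2 and msg3. Exact word matching like this is all such an index gives you; stemming or vector-based ranking would need more machinery on top.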