Conlang: Re: OT: Google Fight! (Dirk Elzinga, Sep 10 '05, 13:45)

Re: OT: Google Fight!

From:	Dirk Elzinga <dirk.elzinga@...>
Date:	Saturday, September 10, 2005, 13:45

Hey.

Yes, doing this sort of thing with Google Fight is cute, but I've been able to use the idea for serious research. I recently completed an article on English adjective comparison which investigates the choice between synthetic comparatives ('sillier') and analytical comparative constructions ('more silly'). The usual rules do a fairly good job, but there is variability that isn't (and can't be) explained in the accounts I've seen. So I applied an explicit and computationally implemented theory of analogy to the problem.

But first, I needed to find the approximate ratios of synthetic to analytical comparatives in use. So I collected a list of almost 500 adjectives which were involved in a comparative construction of some kind. I then made two lists, one containing items of the form 'ADJ-er' and the other containing items of the form 'more ADJ', and submitted them to Google for head-to-head comparisons. I then used the results to assign an outcome to each adjective. For example, the analytical comparative construction 'more quiet' received 114,000 hits, while the synthetic comparative 'quieter' received 31,400 hits, so the adjective 'quiet' was assigned the analytical outcome.

I then used these outcomes in a software simulation and found that the computationally implemented analogical mechanism agreed with the Google results 92.1% of the time, which is slightly better than rule-based accounts. What is interesting, though, is that the analogy algorithm also assigns probabilities to the outcomes which mirror the actual proportions seen in the Google searches (recall that while 'more quiet' was more common, 'quieter' did also receive a sizeable number of hits, so the choice of comparative construction is inherently probabilistic).

The article is forthcoming in the journal Lingua (probably early next year), so you'll all be able to see exactly what I found and why I think it's significant.

Dirk

On 9/10/05, David J. Peterson <dedalvs@...> wrote:

> Okay, so this is really stupid, but entertaining, nonetheless.
> There's a website out there called Google Fight (googlefight.com),
> and all it does is take two words or strings and sees which
> one gets more hits on Google. So there are lots of fights you'd
> expect (tastes great vs. less filling; republicans vs. democrats;
> Coke vs. Pepsi, etc.) and also some odd ones (10.15 vs. 10.17).
> So I decided to pit conlangs and auxlangs against each other,
> for funsies. The results:
>
> conlang: 1,210,000 results
> auxlang: 67,000
>
> That's a huge margin. Anyway, some others:
>
> artlang: 122,000
> engelang: 523
> lostlang: 120
> model language: 122,000
> planned language: 16,600
> artificial language: 386
> created language: 18,800
> constructed language: 71,700
> conlanger: 11,600
>
> Also, of my languages, apparently Kamakawi gets the most
> mentions--almost twice as many as the runner up. Epiq gets
> 52,300, but I think that's probably for some other reason.
> Sathir gets a bunch of hits because of Everquest, apparently.
>
> Anyway, this barely warrants a message. I do think Google Fight
> is cute, though.
>
> -David
> *******************************************************************
> "A male love inevivi i'ala'i oku i ue pokulu'ume o heki a."
> "No eternal reward will forgive us now for wasting the dawn."
>
> -Jim Morrison
>
> http://dedalvs.free.fr/
>

-- Gmail Warning: Watch the reply-to!
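
For anyone who wants to try the head-to-head tally from the reply above, here is a minimal Python sketch of that one step: given hit counts for 'ADJ-er' and 'more ADJ', assign an outcome and compute the share of hits that went to the analytical form. The names (Comparison, outcome, analytic_share, agreement_rate) are made up for illustration, and the tie rule is an assumption, since the message doesn't say how ties were handled. The hit counts for 'quiet' are the ones quoted in the message. This is not the analogy model from the article, only the bookkeeping around it.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    """Hit counts for one adjective (hypothetical container, not from the article)."""
    adjective: str
    synthetic_hits: int  # hits for 'ADJ-er', e.g. 'quieter'
    analytic_hits: int   # hits for 'more ADJ', e.g. 'more quiet'

    @property
    def outcome(self) -> str:
        # Assumption: a tie goes to the synthetic form; the message gives no tie rule.
        if self.analytic_hits > self.synthetic_hits:
            return "analytic"
        return "synthetic"

    @property
    def analytic_share(self) -> float:
        # Proportion of all comparative hits that used 'more ADJ'.
        total = self.synthetic_hits + self.analytic_hits
        if total == 0:
            raise ValueError(f"no hits at all for {self.adjective!r}")
        return self.analytic_hits / total


def agreement_rate(predicted: list[str], observed: list[str]) -> float:
    """Fraction of adjectives where a model's outcome matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Counts quoted in the message for 'quiet'.
quiet = Comparison("quiet", synthetic_hits=31_400, analytic_hits=114_000)
print(quiet.adjective, quiet.outcome, round(quiet.analytic_share, 3))
```

Keeping the share alongside the winner matters here because the message's point is that the choice is probabilistic: 'quieter' loses the fight but still accounts for a sizeable fraction of the hits.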