[debian-devel:11905] Re: New Search Engine?
Hi.
In article <20000316175118.K22023@xxxxxxxxxx>,
at Thu, 16 Mar 2000 17:51:18 -0800,
on Re: New Search Engine?,
"Darren O. Benham" <gecko@debian.org> writes:
> Barring problems from -admin, I'd suggest running it on master.. but don't
> link it into the web pages...
>
> Here's two thoughts:
>
> 1) Do the indices get created incrementally, or do they get
> recreated on each run?
>
> 2) What is the size of the indices going to be?
There was a thread on this (debian-www) list about using namazu (namazu2)
to search the Web pages (the work can also be split across several
machines).
# namazu is packaged for Debian and maintained by kitame@debian.org
namazu can do incremental indexing, and the index can be created on a
machine other than the web server.
NOKUBI (knok@debian.or.jp) wrote:
| First, re-indexing is no heavier than the initial indexing. It is
| "difference indexing": untouched files are not processed.
|
| Second, namazu/namazu2 can handle multiple index files, so the index
| processing can be divided (and could be spread over several machines).
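To illustrate what "difference indexing" means here, the idea can be sketched in a few lines of Python. This is only my own illustration, not namazu's actual code: the function name changed_files and the use of modification times to detect touched files are assumptions made for the example.

```python
import os


def changed_files(root, previous):
    """Select the files that a "difference indexing" run would process.

    `previous` maps path -> modification time (ns) recorded by the last
    run. Returns (to_index, current): the paths that are new or touched,
    and the state to save for the next run. Deleted files simply drop
    out of `current`. (Hypothetical helper, not part of namazu.)
    """
    current = {}
    to_index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime_ns
            current[path] = mtime
            # Untouched files have the same mtime as before and are skipped.
            if previous.get(path) != mtime:
                to_index.append(path)
    return sorted(to_index), current
```

With an empty `previous` every file is selected, which corresponds to the initial indexing; on later runs only new or modified files are returned, so the work is proportional to what changed.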
Required time to create the initial index with namazu:
BTS (www.jp.debian.org/Bugs/)
  Size (bytes):     139,887,588
  Total Documents:  16,748
  Total Keywords:   1,352,486
  Time (sec):       12,566
  File/Sec:         1.33

debian-devel (Debian Project, www.jp.debian.org/Lists-Archives/debian-devel-nnnn)
  Size (bytes):     281,988,180
  Total Documents:  60,399
  Total Keywords:   579,385
  Time (sec):       16,163
  File/Sec:         3.74

debian-user (Debian Project, www.jp.debian.org/Lists-Archives/debian-user-nnnn)
  Size (bytes):     363,076,366
  Total Documents:  89,959
  Total Keywords:   743,521
  Time (sec):       27,283
  File/Sec:         3.30

debian-users-jp (Debian JP Project, www.debian.or.jp/Lists-Archives/debian-users/)
  Size (bytes):     90,384,962
  Total Documents:  20,800
  Total Keywords:   413,239
  Time (sec):       5,805
  File/Sec:         3.58

debian-devel-jp (Debian JP Project, www.debian.or.jp/Lists-Archives/debian-devel/)
  Size (bytes):     54,418,491
  Total Documents:  11,642
  Total Keywords:   328,629
  Time (sec):       3,062
  File/Sec:         3.80
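
The File/Sec figures above are just Total Documents divided by Time (sec). As a quick cross-check, here is a small Python sketch that recomputes them from the numbers in this message; the names INITIAL_INDEX_RUNS, files_per_second and overall_rate are made up for the example.

```python
# (documents, seconds) per archive, copied from the figures above.
INITIAL_INDEX_RUNS = {
    "BTS": (16_748, 12_566),
    "debian-devel": (60_399, 16_163),
    "debian-user": (89_959, 27_283),
    "debian-users-jp": (20_800, 5_805),
    "debian-devel-jp": (11_642, 3_062),
}


def files_per_second(documents, seconds):
    """Indexing throughput for one archive."""
    return documents / seconds


def overall_rate(runs):
    """Throughput over all archives: total documents / total seconds."""
    total_docs = sum(docs for docs, _ in runs.values())
    total_secs = sum(secs for _, secs in runs.values())
    return total_docs / total_secs


if __name__ == "__main__":
    for name, (docs, secs) in INITIAL_INDEX_RUNS.items():
        print(f"{name}: {files_per_second(docs, secs):.2f} files/sec")
    print(f"overall: {overall_rate(INITIAL_INDEX_RUNS):.2f} files/sec")
```

Taken together the five initial runs add up to 64,879 seconds, roughly 18 hours, for 199,548 documents. Note that the BTS run is clearly slower per file than the list archives.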
KITAME (kitame@debian.org) wrote:
| I re-index every day at 04:00 JST. Please check
|
| http://sakura.debian.or.jp/~kitame/mknmz.logs/
| http://sakura.debian.or.jp/~kitame/mknmz.logs/summary/
|
| These files are updated at every re-indexing.
Currently our server (sakura.debian.or.jp) has some hardware trouble,
but I hope we will get a new machine in several weeks, and then I think
we can provide the required index for searching Debian's Web pages.
--
Taketoshi Sano: <sano@debian.org>,<sano@debian.or.jp>,<kgh12351@xxxxxxxxxxx>