Contents of /deb-maint/pinot/trunk/TODO

Revision 12334
Mon Feb 2 16:36:15 2009 UTC by hanska-guest
File size: 6613 byte(s)
New upstream release
Documentation
- List which files from libtextcat 2.2 go where ?
- Say where libtextcat 3.0 can (and cannot ;-) be found
- MIME type should be as returned by xdg-utils' 'xdg-mime query filetype ...'
- Try listing names of dependency packages for most distros
- How to troubleshoot with delve, get to a file with filters and labels
- Explain when indexing and updating are done, eg in the results list, Index on
  a result already indexed doesn't update it

General
- Fix the FIXMEs
- Get rid of dead code/classes/methods...
- Advertise service via Rendezvous
- Extend metadata beyond title,location,language,type,timestamp,size
- Don't package gmo files, they are platform dependent
- CLI programs to use tty highlighting if available

Barpanel
- Write a plugin for barpanel http://www.igelle.org/barpanel/

Tokenize
- Allow caching of documents that had to be converted ? eg PDF, MS Word
- Write a PDF filter that handles columns correctly, with poppler ?
- WordPerfect filter with libwpd
- Office filter with libgst
- TeX filter
- HtmlFilter to look for META tags Author, Creator, Publisher and CreationDate
- XmlFilter is slow-ish, rewrite file parsing with the TextReader interface
- Filters should at least return errno when they fail

SQL
- Move history files into the index directories

Monitor
- Implement support for Solaris FEM

Collect
- Comply with robot stuff defined at http://www.robotstxt.org/
- Harvest mode grabs all pages on a specific site down to a certain depth
- Make User-Agent string configurable
- Make download timeout configurable
- Support for HTML frames
- Test NeonDownloader

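The harvest mode item in Collect could be sketched as a breadth-first walk with a depth limit. This is only an illustration: LinkGraph and harvest() are hypothetical names, the in-memory link map stands in for actually downloading pages and extracting their links, and nothing here checks that links stay on the same site or honours robots.txt.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the downloader: maps a URL to the links found on that page.
typedef std::map<std::string, std::vector<std::string> > LinkGraph;

// Breadth-first harvest: returns startUrl and every page reachable from it
// in at most maxDepth link hops, each page listed once, in visiting order.
std::vector<std::string> harvest(const LinkGraph &site,
                                 const std::string &startUrl,
                                 unsigned int maxDepth)
{
    std::vector<std::string> visited;
    std::set<std::string> seen;
    std::queue<std::pair<std::string, unsigned int> > pending;

    seen.insert(startUrl);
    pending.push(std::make_pair(startUrl, 0u));
    while (!pending.empty())
    {
        std::pair<std::string, unsigned int> current = pending.front();
        pending.pop();
        visited.push_back(current.first);

        if (current.second >= maxDepth)
        {
            // Depth limit reached, don't follow this page's links
            continue;
        }
        LinkGraph::const_iterator pageIter = site.find(current.first);
        if (pageIter == site.end())
        {
            // No known links for this page
            continue;
        }
        for (std::vector<std::string>::const_iterator linkIter = pageIter->second.begin();
             linkIter != pageIter->second.end(); ++linkIter)
        {
            // insert() reports whether the URL had not been seen before
            if (seen.insert(*linkIter).second)
            {
                pending.push(std::make_pair(*linkIter, current.second + 1));
            }
        }
    }
    return visited;
}
```

With a depth of 0 only the start page is returned. The set of seen URLs is what keeps pages that link back to each other from being fetched twice.
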
Search
- With engines that provide a redirection URL for results (eg Acoona), it looks like
  the query history is not saved/checked correctly
- Make sure Description files' SyndicationRight is not private or closed
- getCloseTerms() should be a search engine method so that WebEngine can use plugins'
  suggestions Url field (http://developer.mozilla.org/en/docs/Supporting_search_suggestions_in_search_plugins)
- Filters with CJKV should work better; supporting quoting would help, eg title:"你好"
- Check Mozdex plugin once it's back up
- Add a plugin for http://arxiv.org/find

Index
- Play around with the XAPIAN_FLUSH_THRESHOLD env var
- MD5 hash to determine on updates whether documents have changed, as done by omindex
- Allow access to remote Xapian indexes tunneled through ssh with xapian-progsrv,
  and make sure ssh will ask for passwords with /usr/libexec/openssh/ssh-askpass
- Index Nautilus metadata (http://linuxboxadmin.com/articles/nautilus.php)
- Reverse terms so that left wildcards can be applied ?
- XapianIndex could do with some common code refactoring
- Automatically categorize documents based on MIME type and source into picture, video, etc...
- After indexing or updating a document, a call to getDocumentInfo() shouldn't be necessary
- Labels and the rest of DocumentInfo are handled separately, they shouldn't be
- Indexes have no knowledge of indexId's
- Be ready to catch DatabaseModifiedError exceptions and reopen the index
- Think about security issues, especially when indexes are shared, based on http://plg.uwaterloo.ca/~claclark/fast2005.pdf

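The idea behind reversing terms for left wildcards, from the Index list, is that a query such as "*ing" becomes a plain prefix scan for "gni" over the reversed terms. A minimal sketch follows, assuming the reversed terms are kept in a sorted std::set; reverseTerm() and matchLeftWildcard() are hypothetical names, not existing Pinot or Xapian functions, and the reversal is byte-wise, so multi-byte UTF-8 terms would need proper handling.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the term with its bytes in reverse order.
// Assumption: single-byte characters only, UTF-8 sequences would be scrambled.
std::string reverseTerm(const std::string &term)
{
    return std::string(term.rbegin(), term.rend());
}

// Finds all terms ending with suffix, ie matching the left wildcard "*suffix",
// by running a prefix scan over the sorted set of reversed terms.
std::vector<std::string> matchLeftWildcard(const std::set<std::string> &reversedTerms,
                                           const std::string &suffix)
{
    std::vector<std::string> matches;
    const std::string prefix = reverseTerm(suffix);

    // All entries sharing the prefix are contiguous, starting at lower_bound()
    for (std::set<std::string>::const_iterator termIter = reversedTerms.lower_bound(prefix);
         termIter != reversedTerms.end()
             && termIter->compare(0, prefix.size(), prefix) == 0;
         ++termIter)
    {
        // Reverse again to hand back the original term
        matches.push_back(reverseTerm(*termIter));
    }
    return matches;
}
```

The cost is that every term is stored twice, once as is and once reversed, which is presumably why the item ends with a question mark.
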
Mail
- Find out what kind of locking scheme Mozilla uses (POSIX lock ?) and use that
- Index Evolution email (Camel, might be useful for other types actually)
- Index mail headers
- Decipher and use Mozilla's mailbox scheme, eg
  mailbox://mbox_file_name?number=2164959&part=1.2&type=text/plain&filename=portability.txt
- Keep track of attachments and avoid indexing the same file twice
- Mailboxes where all messages are flagged by Mozilla/that are empty are not indexed at all

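As a starting point for the mailbox scheme item, the example URL above can be split into the mbox file name and its query parameters. This sketch only does the splitting; MailboxUrl and parseMailboxUrl() are hypothetical names, what Mozilla means by number and part remains to be worked out, and no percent-decoding is attempted.

```cpp
#include <map>
#include <string>

// Hypothetical holder for the pieces of a mailbox:// URL
struct MailboxUrl
{
    std::string fileName;                      // text between "mailbox://" and '?'
    std::map<std::string, std::string> params; // eg number, part, type, filename
};

// Splits a mailbox:// URL into file name and parameters.
// Returns false if the URL doesn't use the mailbox scheme.
bool parseMailboxUrl(const std::string &url, MailboxUrl &parsed)
{
    const std::string scheme("mailbox://");
    if (url.compare(0, scheme.size(), scheme) != 0)
    {
        return false;
    }

    std::string::size_type queryPos = url.find('?', scheme.size());
    parsed.params.clear();
    if (queryPos == std::string::npos)
    {
        // No parameters, everything after the scheme is the file name
        parsed.fileName = url.substr(scheme.size());
        return true;
    }
    parsed.fileName = url.substr(scheme.size(), queryPos - scheme.size());

    std::string::size_type startPos = queryPos + 1;
    while (startPos < url.size())
    {
        std::string::size_type endPos = url.find('&', startPos);
        if (endPos == std::string::npos)
        {
            endPos = url.size();
        }
        const std::string param = url.substr(startPos, endPos - startPos);
        const std::string::size_type equalPos = param.find('=');
        if (equalPos != std::string::npos)
        {
            parsed.params[param.substr(0, equalPos)] = param.substr(equalPos + 1);
        }
        else if (!param.empty())
        {
            // Parameter without a value
            parsed.params[param] = "";
        }
        startPos = endPos + 1;
    }
    return true;
}
```

Run on the example URL, this gives the file name mbox_file_name and four parameters, with type set to text/plain and part set to 1.2.
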
Daemon
- Allow building without the daemon
- Allow deactivating the D-Bus interface
- Clean up method names
- Prefer ustring to string whenever possible
- Queue unindexing too
- Follow updates to Xesam specs
- Send a signal when crawling is done so that the UI can reopen the index
- The daemon should ask for permission before reindexing, especially if the corpus is large
- What does a first run mean for the daemon ? ie no configuration file
- Daemon should use worker threads' doWork() instead of duplicating code
- Only crawl newly added locations when the configuration changes

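Crawling only newly added locations, the last Daemon item, boils down to a set difference between the locations in the new configuration and those already known. A minimal sketch, assuming locations are plain strings held in std::set; findAddedLocations() is a hypothetical name, not an existing daemon method.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Returns the locations found in newLocations but not in oldLocations,
// ie the only ones that need crawling after a configuration change.
std::set<std::string> findAddedLocations(const std::set<std::string> &oldLocations,
                                         const std::set<std::string> &newLocations)
{
    std::set<std::string> added;

    // Both inputs are sorted sets, as std::set_difference requires
    std::set_difference(newLocations.begin(), newLocations.end(),
                        oldLocations.begin(), oldLocations.end(),
                        std::inserter(added, added.begin()));
    return added;
}
```

Swapping the two arguments gives the locations that were removed, which is what queued unindexing would need.
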
UI
- Show which threads are running, what they are doing, and allow stopping them
  selectively
- Display search engines' icons (Gtk::IconSource::set_filename() and Gtk::Style::render_icon())
- Replace glademm with libglademm ?
- Use unique (http://www.gnome.org/~ebassi/source/) if available
- Either Live Query behaves like a live query (eg results list updated when new
  documents match) or it is renamed to something else to avoid confusion
- When viewing or indexing a result, all rows for that same URL should be updated with
  the Viewed or Indexed icons (the latter after IndexingThread returns)
- Make use of gtkmm 2.10 StatusIcon
- Unknown exceptions in IndexingThread or elsewhere should be logged as errors
- Delete all temporary files when exiting
- Query expansion should be interactive
- Default cache provider should be configurable
- Offer to index newly mounted volumes
- UI doesn't show documents indexed by the daemon the very first time it's run,
  at least until it's restarted
- Status dialog to show time of latest update
- Unique preferences
- Use gtk2 2.14's gtk_show_uri()
- Reload settings after preferences exit
- Changing set group by mode a few times will show index results under engine "xapian", why ?
- getIndexNames() to return ustring's
- Always call getIndexPropertiesByName() with a ustring, store engine names as ustring's

v0.90
- Filters should have a version number so that new versions only reindex documents
  of the given type
- Queries should be cancellable
- Queries should return the top N results first, then the rest
- D-Bus (Simple)Query shouldn't let the bus connection time out before replying
- Live and stored queries shouldn't cap on the number of results but the number of results per page
- For each query group in the results list, show Next and Previous buttons to page through results
- Browse mode to be merged with the new search and page mode
- The query builder for stored queries should be available for live queries too
- PinotSettings and threads to be moved outside of UI
- CLI programs shouldn't require details about backends, should know indexes by name and
  know their backends
- pinot-search should be able to run stored queries found in the configuration file
- pinot-index should be able to index directories recursively, as done by the daemon
- Command-line tools to work with relative paths
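The filter version item in v0.90 might work along these lines: remember which filter version each MIME type was last indexed with, and compare against the versions of the filters now installed. This is a sketch under assumptions, with FilterVersions and typesToReindex() as hypothetical names and versions taken to be plain increasing integers.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical bookkeeping: MIME type -> version of the filter handling it
typedef std::map<std::string, unsigned int> FilterVersions;

// Returns the MIME types whose documents need reindexing, ie types whose
// filter is newer than the one used at indexing time, or wasn't there at all.
std::set<std::string> typesToReindex(const FilterVersions &indexedWith,
                                     const FilterVersions &current)
{
    std::set<std::string> types;

    for (FilterVersions::const_iterator filterIter = current.begin();
         filterIter != current.end(); ++filterIter)
    {
        FilterVersions::const_iterator oldIter = indexedWith.find(filterIter->first);
        if ((oldIter == indexedWith.end()) ||
            (oldIter->second < filterIter->second))
        {
            types.insert(filterIter->first);
        }
    }
    return types;
}
```

Documents of any other type are left alone, which is the point of the item: upgrading one filter shouldn't trigger a full reindex.
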
