<?xml version="1.0" encoding="UTF-8"?>

<rss version="2.0"
 xmlns:blogChannel="http://backend.userland.com/blogChannelModule"
>

<channel>
<title><![CDATA[Dobrica Pavlinušić's random unstructured stuff: Grep]]></title>
<link>https://saturn.ffzg.hr/rot13/index.cgi?grep</link>
<description></description>
<pubDate>Mon, 11 Jun 2007 09:24:59 -0000</pubDate>
<webMaster>root@saturn.ffzg.hr</webMaster>
<generator>Socialtext Workspace v2.19.0.2</generator>

<item>
<title><![CDATA[Grep]]></title>
<link>https://saturn.ffzg.hr/rot13/index.cgi?grep</link>
<description><![CDATA[<div>Creator: Dobrica Pavlinušić</div><hr/><div>Tags: Jifty, projects</div><hr/><div class="wiki">
<p>
Grep is an RSS feed searcher and cacher. More accurately, think of Grep as your search console.<br />
If you live in buzzword land, you can even call it an <em>information worker workbench</em>, but enough of that.</p>
<p>
It's most useful with web sites that provide search results as an RSS feed. If they include full content in their feeds, even better. One example of such a web service is the <a target="_blank" title="(external link)" href="http://www.socialtext.com/">SocialText wiki</a>, for which Grep was originally written.</p>
<blockquote>
I <strong>know</strong> that I wrote it in some wiki workspace, <strong>but where</strong> ?!</blockquote>
<br /><p>
Following that idea, Grep gained a powerful plugin mechanism which enables users (err, developers who can write a 10-line Perl module) to scrape any site which has a form or REST API and produce search results as links. While doing that, it also fetches the result pages and caches them locally. Keep in mind that this is a slow process which puts considerable load on the remote server, so use it sparingly. However, once remote results are fetched, they are always available in the local cache for quick reference, even when offline!</p>
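<p>
The fetch-and-cache flow described above can be sketched roughly as follows. This is a minimal Python illustration, not Grep's actual Perl code: <code>PageCache</code>, <code>search_and_cache</code>, and the <code>source</code> and <code>fetch</code> callables are hypothetical names, and a real plugin would do the HTTP requests and scraping itself.</p>
<pre>
```python
import hashlib
from pathlib import Path


class PageCache:
    """Stores fetched pages on disk, keyed by a hash of their URL."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, url):
        # Hash the URL so that any URL maps to a safe file name.
        return self.directory / hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get(self, url, fetch):
        """Return the cached page for url, calling fetch(url) only on a miss."""
        path = self._path(url)
        if path.exists():
            return path.read_text(encoding="utf-8")
        content = fetch(url)  # the slow step that loads the remote server
        path.write_text(content, encoding="utf-8")
        return content


def search_and_cache(source, query, cache, fetch):
    """Run one source plugin and cache every result page it links to.

    source is a hypothetical plugin: it takes a query string and
    returns a list of (title, url) pairs.
    """
    results = source(query)
    return [(title, url, cache.get(url, fetch)) for title, url in results]
```
</pre>
<p>
Passing <code>fetch</code> in as a callable keeps the slow network step in one place, so a repeated search for the same URL is served from disk without touching the remote server.</p>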
<p>
For now, here is a feature list:</p>
<ul>
<li>caches all results locally, great for offline use</li>
<li>supports credential spoofing using cookies (useful for logging in to protected areas)</li>
<li>comes with an easy bookmarklet helper for subscribing</li>
<li>written using <a target="_blank" title="(external link)" href="http://jifty.org/">Jifty</a> and <a target="_blank" title="(external link)" href="http://www.rectangular.com/kinosearch/">KinoSearch</a></li>
<li>scrapers for other wikis and CMS engines which can be a <a target="_blank" title="(external link)" href="http://svn.rot13.org/index.cgi/Grep/browse/lib/Grep/Source">source of items</a> for Grep</li>
<li>de-duplicates local results (based on <a target="_blank" title="(external link)" href="http://search.cpan.org/~janpom/Text-DeDuper/">near-duplicate detection</a>, which is great for sites that change only slightly, like wikis)</li>
<li>imports local HTML pages from the <a target="_blank" title="(external link)" href="http://amb.vis.ne.jp/mozilla/scrapbook/">ScrapBook</a> Firefox plugin</li>
</ul>
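<p>
To give a feel for the near-duplicate detection mentioned in the list, here is a minimal Python sketch of one common approach: compare the sets of word triples (shingles) of two texts and treat them as duplicates when the overlap is high. It is an illustration under assumptions, not the code Grep or Text::DeDuper actually runs; the function names and the 0.7 threshold are made up for the example.</p>
<pre>
```python
def shingles(text, size=3):
    """Return the set of word n-grams (shingles) of the given size."""
    words = text.lower().split()
    if not words:
        return set()
    # Texts shorter than one shingle still produce a single, shorter tuple.
    count = max(len(words) - size + 1, 1)
    return {tuple(words[i:i + size]) for i in range(count)}


def resemblance(a, b, size=3):
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    union = sa.union(sb)
    if not union:
        return 1.0  # two empty texts are treated as identical
    return len(sa.intersection(sb)) / len(union)


def deduplicate(texts, threshold=0.7, size=3):
    """Keep a text only if it is not a near duplicate of one already kept."""
    kept = []
    for text in texts:
        if all(threshold > resemblance(text, other, size) for other in kept):
            kept.append(text)
    return kept
```
</pre>
<p>
A wiki page saved twice with a one-word edit shares almost all of its shingles with the earlier copy, so only the first copy survives, while unrelated pages share none and are both kept.</p>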
<p>
Source code is in the <a target="_blank" title="(external link)" href="http://svn.rot13.org/index.cgi/Grep/">development Subversion repository</a>.</p>
</div>
]]></description>
<author>Dobrica Pavlinu&#x161;i&#x107;</author>
<category>Jifty, projects</category>
<guid isPermaLink="true">https://saturn.ffzg.hr/rot13/index.cgi?grep</guid>
<pubDate>Mon, 11 Jun 2007 09:24:59 -0000</pubDate>
</item>
</channel>
</rss>