<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Guyub - Konsultan F/OSS &#187; Pemrograman</title>
	<atom:link href="http://guyub.co.id/category/pemrograman/feed/" rel="self" type="application/rss+xml" />
	<link>http://guyub.co.id</link>
	<description>GNU/Linux - Java, PHP, Ruby - MySQL, PostgreSQL</description>
	<lastBuildDate>Thu, 29 Jul 2010 10:11:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>QA#4: Java EE 6: Developers focus on business logic, Much lower TCO &#8211; by Johan Vos</title>
		<link>http://guyub.co.id/qa4-java-ee-6-developers-focus-on-business-logic-much-lower-tco-by-johan-vos/</link>
		<comments>http://guyub.co.id/qa4-java-ee-6-developers-focus-on-business-logic-much-lower-tco-by-johan-vos/#comments</comments>
		<pubDate>Wed, 28 Jul 2010 14:43:43 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Pemrograman]]></category>
		<category><![CDATA[Sindikasi]]></category>
		<category><![CDATA[business logic]]></category>
		<category><![CDATA[developers]]></category>
		<category><![CDATA[java]]></category>

		<guid isPermaLink="false">http://guyub.co.id/qa4-java-ee-6-developers-focus-on-business-logic-much-lower-tco-by-johan-vos/</guid>
		<description><![CDATA[
Content available at: http://blogs.sun.com/arungupta/entry/qa_4_java_ee_6

]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.java.net/blog/arungupta/archive/2010/07/28/qa4-java-ee-6-developers-focus-business-logic-much-lower-tco-johan">
<p>Content available at: <a href="http://blogs.sun.com/arungupta/entry/qa_4_java_ee_6">http://blogs.sun.com/arungupta/entry/qa_4_java_ee_6</a></p>
<p></a></p>
]]></content:encoded>
			<wfw:commentRss>http://guyub.co.id/qa4-java-ee-6-developers-focus-on-business-logic-much-lower-tco-by-johan-vos/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PHP for Android, PHP 6 canceled, APC in PHP 5.4</title>
		<link>http://guyub.co.id/php-for-android-php-6-canceled-apc-in-php-5-4-lately-in-php-podcast-episode-3-php-classes/</link>
		<comments>http://guyub.co.id/php-for-android-php-6-canceled-apc-in-php-5-4-lately-in-php-podcast-episode-3-php-classes/#comments</comments>
		<pubDate>Sun, 25 Jul 2010 18:39:44 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[GNU/Linux]]></category>
		<category><![CDATA[Pemrograman]]></category>
		<category><![CDATA[Sindikasi]]></category>
		<category><![CDATA[android]]></category>
		<category><![CDATA[apc]]></category>
		<category><![CDATA[php]]></category>

		<guid isPermaLink="false">http://guyub.co.id/php-for-android-php-6-canceled-apc-in-php-5-4-lately-in-php-podcast-episode-3-php-classes/</guid>
		<description><![CDATA[

PHP for Android, PHP 6 canceled, APC in PHP 5.4 &#8211; Lately in PHP podcast episode 3
By Manuel Lemos
On this episode of the Lately in PHP podcast, Manuel Lemos and Ernani Joppert comment on the launch of the PHP for Android project and the consequences for the PHP market.
They also talk about the cancellation of [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.phpclasses.org/blog/post/126-PHP-for-Android-PHP-6-canceled-APC-in-PHP-54--Lately-in-PHP-podcast-episode-3.html">
<div style="clear: both">
<div style="margin-top: 1ex"><a href="http://www.phpclasses.org/blog/post/126-PHP-for-Android-PHP-6-canceled-APC-in-PHP-54--Lately-in-PHP-podcast-episode-3.html">PHP for Android, PHP 6 canceled, APC in PHP 5.4 &#8211; Lately in PHP podcast episode 3</a></div>
<div style="margin-top: 1ex">By Manuel Lemos</a></div>
<div style="margin-top: 1ex">On this episode of the Lately in PHP podcast, Manuel Lemos and Ernani Joppert comment on the launch of the PHP for Android project and the consequences for the PHP market.</p>
<p>They also talk about the cancellation of PHP 6 and the inclusion of features planned for PHP 6 in PHP 5.4, like the integration of the APC cache extension in the main PHP distribution bundle.</p>
<p>Some of the most interesting classes nominated for the May edition of the PHP Programming Innovation Award are also discussed, such as the PDF text extract, PHP duplicate files finder, Fast Fourier Transform and splx_graph.</a></div>
</div>
<p></a></p>
]]></content:encoded>
			<wfw:commentRss>http://guyub.co.id/php-for-android-php-6-canceled-apc-in-php-5-4-lately-in-php-podcast-episode-3-php-classes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>JavaOne News Update 1</title>
		<link>http://guyub.co.id/javaone-news-update-1/</link>
		<comments>http://guyub.co.id/javaone-news-update-1/#comments</comments>
		<pubDate>Sun, 25 Jul 2010 17:46:52 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Pemrograman]]></category>
		<category><![CDATA[Sindikasi]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[javaone]]></category>

		<guid isPermaLink="false">http://guyub.co.id/javaone-news-update-1/</guid>
		<description><![CDATA[
An update on some recent News on
JavaOne 2010.
As you know
JavaOne San Francisco is Sep 19-23, 2010.
The
Official page
has links to the
Registration Page
and the
Online Catalog.
News updates include:
&#8226;
A surprisingly useful &#038; manageable Catalog-as-tweets
via
@javaoneconf
&#8226;
Availability of
Schedule Builder (post)
&#8226;
Open enrollment in
Java University (post)
&#8226;
Announcement of dates for JavaOne Brazil and JavaOne China (post).
&#8226; The day before there is a
MySQL Sunday!
&#8226; And, the
Duke [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blogs.sun.com/theaquarium/entry/javaone_news_update_1"><br />
An update on some recent News on<br />
JavaOne 2010.<br />
As you know<br />
JavaOne San Francisco is Sep 19-23, 2010.<br />
The<br />
Official page<br />
has links to the<br />
Registration Page<br />
and the<br />
Online Catalog.<br />
News updates include:</p>
<p>&bull;<br />
A surprisingly useful &#038; manageable Catalog-as-tweets<br />
via<br />
@javaoneconf</p>
<p>&bull;<br />
Availability of<br />
Schedule Builder (post)</p>
<p>&bull;<br />
Open enrollment in<br />
Java University (post)</p>
<p>&bull;<br />
Announcement of dates for JavaOne Brazil and JavaOne China (post).</p>
<p>&bull; The day before there is a<br />
MySQL Sunday!</p>
<p>&bull; And, the<br />
Duke Awards<br />
submissions page seems to still be active.</p>
<p>Also, this year will be the 15th anniversary for Java, and the 5th for GlassFish.&nbsp; Don&#8217;t know if there will be a BDay party for Java; still hoping we can put something together for GlassFish, we will see!</p>
<p>More related news is tagged<br />
JavaOne.<br />
</a></p>
]]></content:encoded>
			<wfw:commentRss>http://guyub.co.id/javaone-news-update-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Rails on PostgreSQL: Pivotal Labs Talk &#8211; Scaling a Rails App with Postgres</title>
		<link>http://guyub.co.id/rails-on-postgresql-pivotal-labs-talk-scaling-a-rails-app-with-postgres/</link>
		<comments>http://guyub.co.id/rails-on-postgresql-pivotal-labs-talk-scaling-a-rails-app-with-postgres/#comments</comments>
		<pubDate>Fri, 23 Jul 2010 18:01:00 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Basisdata]]></category>
		<category><![CDATA[Pemrograman]]></category>
		<category><![CDATA[Sindikasi]]></category>
		<category><![CDATA[postgresql]]></category>
		<category><![CDATA[rails]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[scaling]]></category>

		<guid isPermaLink="false">http://guyub.co.id/rails-on-postgresql-pivotal-labs-talk-scaling-a-rails-app-with-postgres/</guid>
		<description><![CDATA[I&#8217;m slowly catching up with my podcast backlog and came across a Pivotal Labs talk from May 2009.  In this talk Josh Susser and Damon McCormick present Scaling a Rails App with Postgres.  It&#8217;s a little dated now &#8211; the talk was given when PostgreSQL 8.4 was in beta [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://railsonpostgresql.com/2010/07/23/pivotal-labs-talk-scaling-a-rails-app-with-postgres">I&#8217;m slowly catching up with my podcast backlog and came across a Pivotal Labs talk from May 2009.  In this talk Josh Susser and Damon McCormick present <a href="http://pivotallabs.com/talks/67-scaling-a-rails-app-with-postgres">Scaling a Rails App with Postgres</a>.  It&#8217;s a little dated now &#8211; the talk was given when PostgreSQL 8.4 was in beta &#8211; but, still, lots of good stuff.  Here are some notes:</p>
<ul>
<li>They started with an existing Rails app with lots of data, so they had some constraints &#8211; not greenfield development.</li>
<li>Around the 5-6 minute mark there&#8217;s a good discussion of PostgreSQL&#8217;s query optimizer and how it analyzes a table&#8217;s data distribution.  One takeaway (mentioned around 16:20) is to run <code>vacuum</code> more often on a particular table if there are a lot of writes.</li>
<li>10:00 How to set STATISTICS for a particular table.</li>
<li>11:00 Using partial indexes.</li>
<li>14:00 Indexing on expressions.</li>
<li>18:10-23:00 A nice discussion of the <code>EXPLAIN</code> output.</li>
<li>23:45 Here they talk about wide columns.  I&#8217;ve seen this in MySQL as well, where splitting text data out into a separate table yielded some good speedups.</li>
<li>26:10 Some discussion of <code>pgbench</code>.</li>
<li>35:30 How long does it take to add an index to large tables?  They saw times of up to an hour for tables with millions of rows.</li>
<li>36:30 Clustering your data in order to get PostgreSQL to write it more efficiently.</li>
<li>37:30-48:00 A thorough discussion of partitioning tables via table inheritance.  They used an ActiveRecord model (39:23) with a bunch of utility methods.  They also had a cron job to periodically create new partitions.  At 45:15 they make a nice distinction between using partial indexes and partitions &#8211; one advantage is that a partition&#8217;s indexes can differ from its parent&#8217;s indexes.  At 49:00 they mention maybe doing a plugin; I&#8217;m not sure if that happened.</li>
<li>52:00 Some discussion of full text search via <code>tsearch</code>.</li>
<li>53:00 PostgreSQL&#8217;s lack of built in replication outside of WAL shipping, Slony, etc.  Thank goodness 9.0 will address this!</li>
<li>54:00 Some props to <a href="http://www.engineyard.com/">Engine Yard</a> on their PostgreSQL support.</li>
</ul>
<p>Good stuff all around, and thanks to Pivotal for posting these great talks!</p>
<p></a></p>
]]></content:encoded>
			<wfw:commentRss>http://guyub.co.id/rails-on-postgresql-pivotal-labs-talk-scaling-a-rails-app-with-postgres/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ketan Padegaonkar: Code Complexity Visualization for Ruby</title>
		<link>http://guyub.co.id/ketan-padegaonkar-code-complexity-visualization-for-ruby/</link>
		<comments>http://guyub.co.id/ketan-padegaonkar-code-complexity-visualization-for-ruby/#comments</comments>
		<pubDate>Tue, 20 Jul 2010 20:33:05 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Aplikasi]]></category>
		<category><![CDATA[F/OSS]]></category>
		<category><![CDATA[Pemrograman]]></category>
		<category><![CDATA[Sindikasi]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[eclipse]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://guyub.co.id/ketan-padegaonkar-code-complexity-visualization-for-ruby/</guid>
		<description><![CDATA[

Only Valid Measure of Code Quality

Image from http://www.osnews.com/story/19266/WTFs_m
WTF implies lack of clarity. Clear code is easier to understand, easier to maintain and easier to extend.
Announcing saikuro_treemap &#8211; an easy-to-set-up tool to generate complexity treemaps of Ruby code.
See a demo for yourself.

Complexity Visualization of Rake


]]></description>
			<content:encoded><![CDATA[<p><a href="http://ketan.padegaonkar.name/2010/07/20/code-complexity-visualization-for-ruby.html">
<div class="wp-caption alignnone" id="attachment_451" style="width: 510px;"><a href="http://ketan.padegaonkar.name/files/2010/06/wtfm.jpg"><img alt="Only Valid Measure of Code Quality" class="size-full wp-image-451" height="471" src="http://ketan.padegaonkar.name/files/2010/06/wtfm.jpg" width="500" /></a>
<p class="wp-caption-text">Only Valid Measure of Code Quality</p>
</div>
<p>Image from http://www.osnews.com/story/19266/WTFs_m</p>
<p>WTF implies lack of clarity. Clear code is easier to understand, easier to maintain and easier to extend.</p>
<p>Announcing <a href="http://github.com/ThoughtWorksStudios/saikuro_treemap">saikuro_treemap</a> &#8211; an easy-to-set-up tool to generate complexity treemaps of Ruby code.</p>
<p>See a <a href="http://thoughtworksstudios.github.com/rake.ccn.html">demo</a> for yourself.</p>
<div class="wp-caption alignnone" id="attachment_454" style="width: 310px;"><a href="http://thoughtworksstudios.github.com/rake.ccn.html"><img alt="" class="size-medium wp-image-454" height="186" src="http://ketan.padegaonkar.name/files/2010/07/rake.ccn_-300x186.png" width="300" /></a>
<p class="wp-caption-text">Complexity Visualization of Rake</p>
</div>
<p></a></p>
]]></content:encoded>
			<wfw:commentRss>http://guyub.co.id/ketan-padegaonkar-code-complexity-visualization-for-ruby/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tomasz Wegrzanowski: We need syntax for talking about Ruby types</title>
		<link>http://guyub.co.id/tomasz-wegrzanowski-we-need-syntax-for-talking-about-ruby-types/</link>
		<comments>http://guyub.co.id/tomasz-wegrzanowski-we-need-syntax-for-talking-about-ruby-types/#comments</comments>
		<pubDate>Tue, 20 Jul 2010 11:00:11 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Pemrograman]]></category>
		<category><![CDATA[Sindikasi]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[syntax]]></category>

		<guid isPermaLink="false">http://guyub.co.id/tomasz-wegrzanowski-we-need-syntax-for-talking-about-ruby-types/</guid>
		<description><![CDATA[

All this is about discussing types in blog posts, documentation etc. None of that goes anywhere near actual code (except possibly in comments). Ruby never sees that. 
Statically typed languages have all this covered, and we need it too. Not static typing of course &#8211; just an expressive way to talk about what types things [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://t-a-w.blogspot.com/2010/07/we-need-syntax-for-talking-about-ruby.html">
<div class="separator"><a href="http://3.bp.blogspot.com/_IYGc_MWwkfw/TEWDHlH1PMI/AAAAAAAAA_U/585PGD67q4Y/s1600/koteczek_by_kemcio_from_flickr_cc-nc.jpg" title="Koteczek by kemcio from flickr (CC-NC)"><img alt="Koteczek by kemcio from flickr (CC-NC)" border="0" height="480" src="http://3.bp.blogspot.com/_IYGc_MWwkfw/TEWDHlH1PMI/AAAAAAAAA_U/585PGD67q4Y/s640/koteczek_by_kemcio_from_flickr_cc-nc.jpg" width="640" /></a></div>
<p><b>All this is about discussing types in blog posts, documentation etc. None of that goes anywhere near actual code (except possibly in comments). Ruby never sees that.</b> </p>
<p>Statically typed languages have all this covered, and we need it too. Not static typing of course &#8211; just an expressive way to talk about what types things are &#8211; as plain English fails here very quickly. As far as I know nothing like that exists yet, so here&#8217;s my proposal.</p>
<p>This system of type descriptions is meant for humans, not machines. It focuses on the most important distinctions, and ignores details that are not important, or very difficult to keep track of. Type descriptions should only be as specific as necessary in a given context. If it makes sense, these rules should be violated.</p>
<p>In advance I&#8217;ll say I totally ignored all the covariance / contravariance / invariance business &#8211; it&#8217;s far too complicated, and getting too deeply into such issues makes little sense in a language where everything can be redefined.<br />
<h3>Basic types</h3>
<p>Types of simple values can be described by their class name, or any of its superclasses or mixins. So some ways to describe type of <tt>15</tt> would be <tt>Fixnum</tt> (actual class), <tt>Integer</tt> (superclass), <tt>Comparable</tt> (mixin), or <tt>Object</tt> (superclass all the way up).</p>
<p>In context of describing types, everything is considered an <tt>Object</tt>, and existence of <tt>Kernel</tt>, <tt>BasicObject</tt> etc. is ignored.</p>
<p>So far, it should all be rather obvious. Examples:
<ul>
<li><tt>42</tt> &#8211; <tt>Integer</tt></li>
<li><tt>Time.now</tt>&nbsp; &#8211; <tt>Time</tt></li>
<li><tt>Dir.glob("*")</tt> &#8211;  <tt>Enumerable</tt></li>
<li><tt>STDIN</tt> &#8211; <tt>IO</tt></li>
</ul>
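<p>A quick way to sanity-check descriptions like these is <tt>Object#is_a?</tt>, which accepts the actual class, any superclass, or a mixin. The sketch below is only an illustration: the constant name <tt>BASIC_TYPE_EXAMPLES</tt> is made up here, and on current Rubies <tt>Fixnum</tt> has been folded into <tt>Integer</tt>, so <tt>15.class</tt> reports <tt>Integer</tt>.</p>

```ruby
# Each value is paired with the type description from the list above.
# Object#is_a? is true for the actual class, any superclass, and any mixin.
BASIC_TYPE_EXAMPLES = [
  [42,            Integer],
  [Time.now,      Time],
  [Dir.glob("*"), Enumerable],
  [STDIN,         IO],
]

BASIC_TYPE_EXAMPLES.each do |value, type|
  puts "#{value.class} is_a? #{type}: #{value.is_a?(type)}"
end

# 15 can be described at several levels of precision.
puts 15.is_a?(Integer)     # class (Fixnum was the actual class on older Rubies)
puts 15.is_a?(Comparable)  # mixin
puts 15.is_a?(Object)      # superclass all the way up
```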
<h3><tt>nil</tt> and other ignored issues</h3>
<p><tt>nil</tt> will be treated specially &#8211; as if it was of every possible type. <tt>nil</tt> means absence of value, and doesn&#8217;t indicate what type the value would have if it was present. This is messy, but most explicitly typed languages follow this path.</p>
<p>The distinction between situations that allow <tt>nil</tt>s and those that don&#8217;t will be treated like all other value range restrictions (<tt>Integer</tt> must be positive, <tt>IO</tt> must be open for writing etc.) &#8211; as something outside the type system.</p>
<p>For cases where <tt>nil</tt> means something magical, and not just absence of value, it should probably be mentioned.</p>
<p>Checked exceptions and related non-local exits in Ruby would be a hopeless thing to even attempt. There&#8217;s syntax for exceptions and catches used as control structures if they&#8217;re really necessary.</p>
<p>
<h3>Booleans</h3>
<p>We will also pretend that <tt>Boolean</tt> is a common superclass of <tt>TrueClass</tt> and <tt>FalseClass</tt>.</p>
<p>We will also normally ignore distinction between situations where  real <tt>true</tt>/<tt>false</tt> are expected,  and situations where any object goes, but acts identically to its  boolean conversion. Any method that acts identically on <tt>x</tt> and <tt>!!x</tt> can be said to take <tt>Boolean</tt>.</p>
<p>On the other hand, if some values are treated differently than their double negation, that&#8217;s not really <tt>Boolean</tt> and it deserves a mention. Especially if <tt>nil</tt> and <tt>false</tt> are not equivalent &#8211; like in Rails&#8217;s <tt>#in_groups_of</tt> (I don&#8217;t think Ruby stdlib ever does anything like that).</p>
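<p>As a rough illustration of that rule, here are two made-up methods (<tt>verbosity</tt> and <tt>padding</tt> are hypothetical names, not part of any library). The first only branches on its argument, so it acts the same on <tt>x</tt> and <tt>!!x</tt>; the second tells <tt>nil</tt> and <tt>false</tt> apart, so calling its parameter <tt>Boolean</tt> would be misleading.</p>

```ruby
# Hypothetical method: it only branches on its argument, so it behaves
# identically for x and !!x and can be described as taking Boolean.
def verbosity(flag)
  flag ? "verbose" : "quiet"
end

# Hypothetical counter-example: nil and false mean different things here,
# so the parameter is not really Boolean and deserves a mention.
def padding(fill)
  case fill
  when nil   then "default padding"
  when false then "no padding"
  else            "pad with #{fill.inspect}"
  end
end

[nil, false, 0, "", :x].each do |x|
  puts "#{x.inspect}: same as !!x? #{verbosity(x) == verbosity(!!x)}"
end
puts padding(nil)
puts padding(false)
```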
<h3>Duck typing</h3>
<p>If something quacks like a <tt>Duck</tt> convincingly  enough, it can be said to be of type <tt>Duck</tt>, it being  object&#8217;s responsibility that its cover doesn&#8217;t get blown.</p>
<p>In particular, Ruby uses certain methods for automatic type conversion. In many contexts objects implementing <tt>#to_str</tt> like <tt>Pathname</tt>s will be treated as <tt>String</tt>s, objects implementing <tt>#to_ary</tt> as  <tt>Array</tt>s, <tt>#to_hash</tt> as <tt>Hash</tt>es, and <tt>to_proc</tt> as <tt>Proc</tt>s &#8211; this can be used for some amazing things like <tt>Symbol#to_proc</tt>.</p>
<p>This leads to a big complication for us &#8211; C code implementing Ruby interpreter and many libraries is normally written in a way that calls these conversion functions automatically, so in such contexts <tt>Symbol</tt> really is a <tt>Proc</tt>, <tt>Pathname</tt> really is a <tt>String</tt> and so on. On the other hand, in Ruby code these methods are not magical, and such conversions will only happen if explicitly called &#8211; for them <tt>Pathname</tt> and <tt>String</tt> are completely unrelated types. Unless Ruby code calls C code, which then autoconverts.</p>
<p>Explicitly differentiating between contexts which expect a genuine <tt>String</tt> and those which expect either that or something with a valid <tt>#to_str</tt> method would be highly tedious, and I doubt anyone would get it exactly right.</p>
<p>My recommendation would be to treat everything that autoconverts to something as if it subclassed it. So we&#8217;ll pretend <tt>Pathname</tt> is a subclass of <tt>String</tt>, even though it&#8217;s not really. In some cases this will be wrong, but it&#8217;s not really all that different from subclassing something and then introducing incompatible changes. </p>
<p>This all doesn&#8217;t extend to <tt>#to_s</tt>,  <tt>#to_a</tt> etc &#8211; nothing can be described as  <tt>String</tt> just because it has  <tt>to_s</tt> method &#8211; every object has  <tt>to_s</tt> but most aren&#8217;t really strings.</p>
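<p>A small sketch of these implicit conversions, assuming two made-up classes (<tt>PathLike</tt> and <tt>Pair</tt> are hypothetical and exist only for this example). It shows <tt>#to_str</tt> and <tt>#to_ary</tt> being picked up automatically, while <tt>#to_s</tt> is not.</p>

```ruby
# PathLike is a hypothetical class; it does not inherit from String.
class PathLike
  def initialize(path)
    @path = path
  end

  # Implicit conversion hook: C-implemented methods call this automatically.
  def to_str
    @path
  end
end

# Pair is also hypothetical; to_ary lets it stand in for an Array.
class Pair
  def initialize(left, right)
    @left, @right = left, right
  end

  def to_ary
    [@left, @right]
  end
end

path = PathLike.new("/tmp/report.txt")
puts "file: " + path          # String#+ accepts anything with to_str
puts File.basename(path)      # so do the File class methods

left, right = Pair.new(1, 2)  # multiple assignment unpacks via to_ary
puts left + right

puts %w[foo bar].map(&:upcase).inspect  # Symbol#to_proc at work

# to_s is explicit only: every object has it, but it grants no String status.
begin
  "count: " + 42
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```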
<h3>Technical explanation of <tt>to_str</tt> and friends</h3>
<p><i>This section is unrelated to post&#8217;s primary subject &#8211; skip if uninterested.</i></p>
<p>Ruby uses special memory layout for basic types like strings and arrays. Performance would be abysmal if string methods had to actually call Ruby code associated with whatever <tt>[]</tt> happened to be redefined to for every character &#8211; instead they ask for a certain C data structure, and access that directly (via some macros providing extra safety and convenience to be really exact).</p>
<p>By the way this is a great example of C being really slow &#8211; if Ruby was implemented on a platform with really good JIT, it could plausibly have every single string function implemented in terms of calls to <tt>[]</tt>, <tt>[]=</tt>, <tt>size</tt>, and just a few others, with different subclasses of <tt>String</tt> providing different implementations, and the JIT compiler inlining all that to make it really fast.</p>
<p>It would make it really simple to create a class representing a text file, and <tt>=~ /regexp/</tt> that directly without reading anything more than required into memory, or maybe even <tt>gsub!</tt> it in a way that would read it in small chunks, saving them to another file as soon as they&#8217;re ready, and then renaming in one go. All that without the regexp library knowing anything about it all. It&#8217;s all just my fantasy, I&#8217;m not saying any such JIT actually exists.</p>
<p>Anyway, strings and such are implemented specially, but we still want these types to be real objects, not like what they&#8217;ve done in Java. To make it work, all C functions requiring access to underlying storage call a special macro which automatically calls a method like <tt>to_str</tt> or <tt>to_ary</tt> if necessary &#8211; so such objects can pretend to be strings very effectively. For example if you alias method <tt>to_str</tt> to <tt>path</tt> on <tt>File</tt> code like <tt>system File.open("/bin/hostname")</tt> will suddenly start working. It really makes sense only for things which are &#8220;essentially strings&#8221; like <tt>Pathname</tt>, <tt>URI</tt>, Unicode-enhanced strings, proxies for strings in third party libraries like Qt etc.</p>
<p>To complicate things further objects of all classes inheriting from <tt>String</tt> automatically use <tt>String</tt>&#8217;s data representation &#8211; and C code will access that, never calling <tt>to_str</tt>. This leaves objects which duck type as <tt>String</tt>s two choices:
<ul>
<li>Subclass <tt>String</tt> and every time anything changes update C string data. This can be difficult &#8211; if you implement a <tt>URI</tt> and keep the query part as a hash instance variable &#8211; you need to somehow make sure that your update code gets run every time that hash changes &#8211; like by not exposing it at all and only allowing query updates via your direct methods, or wrapping it in a special object that calls you back.</li>
<li>Don&#8217;t subclass <tt>String</tt>, define <tt>to_str</tt> the way you want. Everything works &#8211; except your class isn&#8217;t technically a <tt>String</tt> so it&#8217;s not terribly pretty OO design.</li>
</ul>
<p>You probably won&#8217;t be surprised that not subclassing is the more popular choice. As it&#8217;s all due to technical limitations not design choices, it makes sense to treat such objects as if they were properly subclassed.</p>
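<p>The two choices can be compared side by side. This is a sketch with two made-up classes (<tt>Loud</tt> and <tt>Quiet</tt> are hypothetical): both define <tt>to_str</tt>, but only the one that does not subclass <tt>String</tt> should ever have it called by C code such as <tt>String#+</tt>.</p>

```ruby
# Loud and Quiet are hypothetical classes used only for this comparison.

# Choice 1: subclass String. C code reads the inherited string data
# directly, so the to_str defined here is not consulted.
class Loud < String
  def to_str
    upcase
  end
end

# Choice 2: do not subclass, just define to_str. C code has no string data
# to read, so it calls to_str to obtain some.
class Quiet
  def initialize(text)
    @text = text
  end

  def to_str
    @text.upcase
  end
end

puts "say " + Loud.new("hello")        # uses the stored data as-is
puts "say " + Quiet.new("hello")       # goes through to_str
puts Loud.new("hello").is_a?(String)   # technically a String
puts Quiet.new("hello").is_a?(String)  # only pretends to be one
```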
<div class="separator"><a href="http://1.bp.blogspot.com/_IYGc_MWwkfw/TEWDOvo38GI/AAAAAAAAA_c/_i25a_PeveU/s1600/pussy_by_tripleigrek_from_flickr_cc-sa.jpg" title="Pussy by tripleigrek from flickr (CC-SA)"><img alt="Pussy by tripleigrek from flickr (CC-SA)" border="0" height="480" src="http://1.bp.blogspot.com/_IYGc_MWwkfw/TEWDOvo38GI/AAAAAAAAA_c/_i25a_PeveU/s640/pussy_by_tripleigrek_from_flickr_cc-sa.jpg" width="640" /></a></div>
<p>
<h3>Collections</h3>
<p>Back to the subject. For collections we often want to describe types of their elements. For simple collections yielding successive elements on <tt>#each</tt>, syntax for type description is <tt>CollectionType[MemberType]</tt>. Examples:
<ul>
<li><tt>[42.0, 17.5]</tt> &#8211; <tt>Array[Float]</tt> </li>
<li><tt>Set["foo","bar"]</tt> &#8211; <tt>Set[String]</tt></li>
<li><tt>5..10</tt> &#8211; <tt>Range[Integer]</tt></li>
</ul>
<p>When we don&#8217;t care about collection type, only about element types, descriptions like <tt>Enumerable[ElementType]</tt> will do.</p>
<p>Syntax for types of hashtables is <tt>Hash[KeyType, ValueType]</tt> &#8211; in general collections which yield multiple values to <tt>#each</tt> can be described as <tt>CollectionType[Type1, Type2, ..., TypeN]</tt>.</p>
<p>For example <tt>{:foo =&gt; "bar"}</tt> is of type <tt>Hash[Symbol, String]</tt>.</p>
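<p>Although the notation is meant for humans, a description like <tt>CollectionType[MemberType]</tt> can be checked mechanically. The helper below is a sketch under that assumption; <tt>described_as?</tt> is a made-up name, and it simply iterates the collection and tests what gets yielded.</p>

```ruby
require "set"

# Hypothetical helper: true if +collection+ fits CollectionType[*MemberTypes].
# With one member type every yielded element is checked; with several, every
# yielded tuple is checked position by position (as in Hash[KeyType, ValueType]).
def described_as?(collection, collection_type, *member_types)
  return false unless collection.is_a?(collection_type)

  collection.all? do |entry|
    values = member_types.size == 1 ? [entry] : Array(entry)
    values.size == member_types.size &&
      values.zip(member_types).all? { |value, type| value.is_a?(type) }
  end
end

puts described_as?([42.0, 17.5], Array, Float)
puts described_as?(Set["foo", "bar"], Set, String)
puts described_as?(5..10, Range, Integer)
puts described_as?({ :foo => "bar" }, Hash, Symbol, String)
puts described_as?([1, 2, 3], Enumerable, Integer)   # collection type left vague
puts described_as?([1, "two"], Array, Integer)       # mixed members do not fit
```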
<p>This is optional &#8211; type descriptions like <tt>Hash</tt> or  <tt>Enumerable</tt> are perfectly valid &#8211; and often types  are unrelated, or we don&#8217;t care.</p>
<p>Not every <tt>Enumerable</tt> should be treated as a collection of members like that &#8211; <tt>File</tt> might technically be <tt>File[String]</tt> but it&#8217;s usually pointless to describe it this way. In 1.8 <tt>String</tt> is <tt>Enumerable</tt>, yielding successive lines when iterated &#8211; but <tt>String[String]</tt> makes no sense (no longer a problem in 1.9).</p>
<p>Classes other than <tt>Enumerable</tt> like  <tt>Delegator</tt> might need type parameters, and they should be specified with the same syntax. Their order and meaning depends on particular class, but usually should be obvious.</p>
<h3>Literals and tuples</h3>
<p>Ruby doesn&#8217;t make distinction between <tt>Array</tt>s and tuples. What I mean here is a kind of <tt>Array</tt> which shouldn&#8217;t really be treated as a collection, and in which different members have unrelated type and meaning depending on their position.</p>
<p>Like method arguments. It really wouldn&#8217;t be useful to say that every method takes <tt>Array[Object]</tt> (and an optional <tt>Proc</tt>) &#8211; types and meanings of elements in this array should be specified.</p>
<p>Syntax I want for this is <tt>[Type1, Type2, *TypeRest]</tt> &#8211; so for example <tt>Hash[Date, Integer]</tt>&#8217;s <tt>#select</tt> passes <tt>[Date, Integer]</tt> to the block, which should return a <tt>Boolean</tt> result, and then returns either <tt>Array[[Date, Integer]]</tt> (1.8) or <tt>Hash[Date, Integer]</tt> (1.9). Notice double <tt>[[]]s</tt> here &#8211; it&#8217;s an <tt>Array</tt> of pairs. In many contexts Ruby automatically unpacks such tuples, so <tt>Array[[Date,Integer]]</tt> can often be treated as <tt>Array[Date,Integer]</tt> &#8211; but it doesn&#8217;t go deeper than one level, and if you need this distinction it&#8217;s available. </p>
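<p>The <tt>Hash[Date, Integer]</tt> case looks like this in practice. The data is made up for the example, and the output reflects 1.9-and-later behaviour, where <tt>#select</tt> on a hash returns a hash.</p>

```ruby
require "date"

# Hash[Date, Integer]: visits per day (made-up data).
visits = {
  Date.new(2010, 7, 19) => 12,
  Date.new(2010, 7, 20) => 40,
}

# The block receives the tuple [Date, Integer], unpacked into two block
# parameters, and returns a Boolean.
busy = visits.select { |date, count| count > 20 }
puts busy.class

# Array[[Date, Integer]]: an Array of pairs.
pairs = visits.to_a
pairs.each { |date, count| puts "#{date}: #{count}" }  # one level is unpacked
pairs.each { |pair| puts pair.inspect }                # or take the pair whole
```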
<p>Extra arguments can be specified with <tt>*Type</tt> or <tt>...</tt> which is treated here as <tt>*Object</tt>. If you want to specify some arguments as optional suffix their types with <tt>?</tt> (the most obvious <tt>[]</tt> having too many uses already, and <tt>=</tt> not really fitting right).</p>
<p>In this syntax <tt>[*Foo]</tt> is pretty much equivalent to <tt>Array[Foo]</tt>, or possibly <tt>Enumerable[Foo]</tt> (with some duck typing) &#8211; feel free to use that if it makes things clearer.</p>
<p>Basic literals like <tt>true</tt>, <tt>false</tt>, <tt>nil</tt> stand for themselves &#8211; and for entire <tt>TrueClass</tt>, <tt>FalseClass</tt>,  <tt>NilClass</tt> classes too as they&#8217;re their only members. Other literals such as symbols, strings, numbers etc. can be used too when needed.</p>
<p>To describe keyword arguments and hashes used in similar way, syntax is <tt>{Key1=&gt;Type1, Key2=&gt;Type2}</tt> &#8211; specifying exact key, and type of value like <tt>{:noop=&gt;Boolean, :force=&gt;Boolean}</tt>.</p>
<p>It should be assumed that keys other than those listed are ignored, cause exception, or are otherwise not supported. If they&#8217;re meaningful it should be marked with <tt>...</tt> like this <tt>{:query=&gt;String, ...}</tt>. Subclasses often add extra keyword arguments, and this issue is ignored.</p>
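<p>Here is a sketch of a method whose options would be described as <tt>{:noop=&gt;Boolean, :force=&gt;Boolean}</tt>. The method <tt>remove</tt> and the constant <tt>ALLOWED_REMOVE_OPTIONS</tt> are hypothetical, and the method only builds a message rather than touching any files. Unlisted keys raise, which is one of the behaviours the description allows readers to assume.</p>

```ruby
# Hypothetical method taking options of type {:noop=>Boolean, :force=>Boolean}.
# Keys outside that description are not supported and raise ArgumentError.
ALLOWED_REMOVE_OPTIONS = [:noop, :force]

def remove(path, options = {})
  unknown = options.keys - ALLOWED_REMOVE_OPTIONS
  raise ArgumentError, "unsupported options: #{unknown.inspect}" unless unknown.empty?

  action = options[:force] ? "force-remove" : "remove"
  options[:noop] ? "would #{action} #{path}" : "#{action} #{path}"
end

puts remove("build/", :noop => true, :force => true)
puts remove("build/")

begin
  remove("build/", :recursive => true)
rescue ArgumentError => e
  puts e.message
end
```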
<h3>Functions</h3>
<p>Everything so far was just a prelude to the most important part of any type system &#8211; types for functions. The syntax I&#8217;d propose is: <tt>ArgumentTypes -&gt; ReturnType</tt> (<tt>=&gt;</tt> being already used by hashes).</p>
<p>I cannot decide if blocks should be specified in Ruby-style notation or a function notation, so both <tt>&amp; {|BlockArgumentTypes| BlockReturnType}</tt> and <tt>&amp;(BlockArgumentTypes-&gt;BlockReturnType)</tt> are valid. <tt>&amp;</tt> is necessary, as blocks are passed separately from normal arguments, however strong the temptation to reuse <tt>-&gt;</tt> and let the context disambiguate might be.</p>
<p>Blocks that don&#8217;t take any arguments or don&#8217;t return anything can drop that part, leaving only something like <tt>&amp;{|X|}</tt>, <tt>&amp;{Y}</tt>, <tt>&amp;{}</tt>, or in more functional notation <tt>&amp;(X-&gt;)</tt>, <tt>&amp;(Y)</tt>, <tt>&amp;()</tt>.</p>
<p>Because of all the <tt>[]</tt> unpacking, using  <tt>[]</tt> around arguments, tuple return values etc. is  optional &#8211; and just like in Ruby <tt>()</tt> can be used  instead in such contexts.</p>
<p>If a function doesn&#8217;t take any arguments, or returns no values, these parts can be left out &#8211; leaving perhaps as little as <tt>-&gt;</tt>. </p>
<p>Examples:
<ul>
<li>In context of <tt>%w[Hello world !].group_by(&amp;:size)</tt> method <tt>#group_by</tt> has type <tt>Array[String]&amp;{|String| Integer}-&gt;Hash[Integer,Array[String]]</tt></li>
<li><tt>Time.at</tt> has type <tt>Numeric -&gt; Time</tt></li>
<li><tt>String#tr</tt> has type <tt>[String, String] -&gt; String</tt></li>
<li>On a collection of <tt>Float</tt>s, <tt>#find</tt> would have type <tt>Float?&amp;(Float-&gt;Boolean)-&gt;Float</tt></li>
<li>Function which takes no arguments and returns no values has type <tt>[]-&gt;nil</tt></li>
</ul>
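<p>For reference, the calls described in this list behave as follows when run. This is just an illustration, with <tt>do_nothing</tt> as a made-up method; note that <tt>#group_by</tt> collects the elements for each key into an <tt>Array</tt>.</p>

```ruby
words = %w[Hello world !]
groups = words.group_by(&:size)    # block is String -> Integer
puts groups.inspect                # each value is an Array of words

puts Time.at(0).class                             # Numeric -> Time
puts "hello".tr("el", "ip")                       # [String, String] -> String
puts [0.5, 1.5, 2.5].find { |x| x > 1 }.inspect   # block is Float -> Boolean
puts [0.5].find { |x| x > 1 }.inspect             # no match gives nil

# Hypothetical function of type []->nil: no arguments, no useful value.
def do_nothing
  nil
end
puts do_nothing.inspect
```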
<p>If you really need to specify exceptions and throws, you can add <tt>raises Type</tt>, or <tt>throws :kind</tt> after the return value. Use this only for control structure exceptions, not for actual error exceptions. It might actually be useful if actual data gets passed around.
<ul>
<li><tt>Find.find</tt> has type <tt>[*String]&amp;(String-&gt;nil throws :prune)-&gt;nil</tt></li>
</ul>
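<p>The <tt>throws :prune</tt> part describes a block that may leave early through <tt>throw</tt>, with the caller catching it. Below is a sketch of the same pattern on an in-memory tree rather than the file system; <tt>walk</tt>, <tt>TREE</tt> and <tt>VISITED</tt> are made-up names, and the walker could be described as <tt>Hash&amp;(String-&gt;nil throws :prune)-&gt;nil</tt>.</p>

```ruby
# Hypothetical in-memory walker. The block may `throw :prune` to skip the
# entry it was just given, together with everything below it.
def walk(tree, prefix = "", &block)
  tree.each do |name, children|
    path = prefix + "/" + name
    catch(:prune) do
      block.call(path)               # may throw :prune
      walk(children, path, &block)   # skipped when the block throws
    end
  end
  nil
end

TREE = { "keep" => { "a.txt" => {} }, "skip" => { "b.txt" => {} } }
VISITED = []

walk(TREE) do |path|
  throw :prune if path.end_with?("/skip")
  VISITED.push(path)
end

puts VISITED.inspect
```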
<p>A standalone <tt>Proc</tt> can be described as <tt>(ArgumentTypes-&gt;ReturnType)</tt> just as with notation for functions. There is no ambiguity between <tt>Proc</tt> arguments and block arguments, as blocks are always marked with <tt>&amp;</tt>. </p>
<h3>Type variables and everything else</h3>
<p>In addition to names of real classes, any name starting with an uppercase letter should be considered a type. Unless it&#8217;s specified otherwise in context, all such unknown names should be considered type variables with a big forall quantifier in front of it all.</p>
<p>Examples:
<ul>
<li><tt>Enumerable[A]#partition</tt> has type <tt>&amp;(A-&gt;Boolean)-&gt;[Array[A], Array[A]]</tt></li>
<li><tt>Hash[A,B]#merge</tt> has type <tt>Hash[A,B]&amp;(A,B,B-&gt;B)-&gt;Hash[A,B]</tt></li>
<li><tt>Array[A]#inject</tt> has either type <tt>B&amp;(B,A-&gt;B)-&gt;B</tt> or <tt>&amp;(A,A-&gt;A)-&gt;A</tt>. This isn&#8217;t just the usual case of a missing argument being substituted by <tt>nil</tt> &#8211; these are two completely different functions.</li>
</ul>
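<p>The three examples above, sketched as small Ruby helpers. The helper names are invented for illustration; the point is how the type variables line up with actual arguments, including the two distinct forms of <tt>#inject</tt>.</p>

```ruby
# Enumerable[A]#partition: the block result decides which of the two arrays an element joins.
def split_even_odd(numbers)
  numbers.partition(&:even?)
end

# Hash[A,B]#merge with a block: the block receives key, old value, new value
# and is only called for keys present in both hashes.
def merge_counts(left, right)
  left.merge(right) { |_key, old, new| old + new }
end

# Array[A]#inject with a seed: the accumulator type B may differ from the element type A.
def total_length(words)
  words.inject(0) { |sum, word| sum + word.size }
end

# Array[A]#inject without a seed: the first element is the starting value, so B is A.
def longest(words)
  words.inject { |best, word| word.size > best.size ? word : best }
end
```

<p>In <tt>total_length</tt> the elements are strings and the result is an integer, which is only possible in the seeded form.</p>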
<p>To specify that multiple types are allowed (usually implying that behaviour will be different, otherwise there should be a superclass somewhere, or we could treat it as common duck typing and ignore it) join them with <tt>|</tt>. If there&#8217;s ambiguity between this use and block arguments, parenthesize. It binds more tightly than <tt>,</tt>, so it only applies to one argument. Example:
<ul>
<li><tt>String#index</tt> in 1.8 has type <tt>(String|Integer|Regexp, Integer?)-&gt;Integer</tt> (and notice how I ignored <tt>Fixnum</tt>s here).</li>
</ul>
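<p>A short sketch of the union-typed first argument in practice. It covers only the <tt>String</tt> and <tt>Regexp</tt> forms, since the <tt>Integer</tt> form is specific to 1.8; the helper name <tt>index_examples</tt> is made up.</p>

```ruby
# Each call exercises one alternative of (String|Regexp, Integer?).
def index_examples
  [
    "hello world".index("o"),        # String argument
    "hello world".index("o", 5),     # String plus the optional Integer offset
    "hello world".index(/[aeiou]/),  # Regexp argument
    "hello world".index("z")         # no match: nil rather than an Integer
  ]
end
```

<p>The last call shows that the return type is really <tt>Integer</tt> or <tt>nil</tt>, a detail a short signature can choose to leave out.</p>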
<p>For functions that can be called in multiple unrelated ways, just list them separately &#8211; <tt>|</tt> and parentheses will work, but they are usually top level, and not needed anywhere deeper.</p>
<p>If you want to specify the type of <tt>self</tt>, prefix the function specification with <tt>Type#</tt>:
<ul>
<li><tt>#sort</tt> has a type like <tt>Enumerable[A]#()&amp;(A,A-&gt;1|0|-1)-&gt;Array[A]</tt></li>
</ul>
<p>To specify that something takes a range of values not really corresponding to a Ruby class, just define such extra names somewhere and then use them like this:
<ul>
<li><tt>File#chown</tt> has type <tt>(UnixUserId, UnixUserId)-&gt;0</tt> &#8211; with <tt>UnixUserId</tt> being a pretend subclass of <tt>Integer</tt>, and <tt>0</tt> being the literal value actually returned.</li>
</ul>
<p>To specify that something needs particular methods, just make up a pretend mixin like <tt>Meowable</tt> for <tt>#meow</tt>.</p>
<p>Any obvious extensions to this notation can be used, like this:
<ul>
<li><tt>Enumerable[A]#zip</tt> has type <tt>(Enumerable[B_1], *Enumerable[B_i])-&gt;Array[A, B_1, *B_i]</tt> &#8211; with intention that <tt>B_i</tt>s will be different for each argument understood from context. (I don&#8217;t think any static type system handles cases like this one reasonably &#8211; most require separate case for each supported tuple length, and you cannot use arrays if you mix types. Am I missing something?)</li>
</ul>
<p>
<h3>The End</h3>
<p>Well, what I really wanted to do was talk about the Ruby collection system, and how 1.9 doesn&#8217;t go far enough in its attempts at fixing it. And without a notation for types, talking about higher-order functions that operate on collections quickly turns into a horrible mess. So I started with a brief explanation of the notation I wanted to use, and then I figured I might as well do it right and write something that will be reusable in other contexts too.</p>
<p>Most discussion of type systems concerns issues like safety and flexibility, which don&#8217;t concern me at all, and limit themselves to type systems usable by machines.</p>
<p>I need types for something else &#8211; as statements about data flow. A type signature like <tt>Enumerable[A]#()&amp;(A-&gt;B)-&gt;Hash[A,B]</tt> doesn&#8217;t tell you exactly what such a function does, but it narrows the set of possibilities extremely quickly. What it describes is a function which iterates over a collection in order while building a <tt>Hash</tt> to be returned, using the collection&#8217;s elements as keys, and values returned by the block as values. Can you guess the function I was thinking about here?</p>
<p>Now a type like that is not a complete specification &#8211; a function that returns an empty hash fits it. As does one which skips every 5th element. And one that only keeps entries with unique block results. And for that matter also one that sends your email password to NSA &#8211; at least assuming it returns that <tt>Hash</tt> afterwards.</p>
<p>It was still pretty useful. How about some of these?
<ul>
<li><tt>Hash[A,B] -&gt; Hash[B, Array[A]]</tt></li>
<li><tt>Hash[A,B] &amp;(A,B-&gt;C) -&gt; Hash[A,C]</tt></li>
<li><tt>Hash[A, Hash[B,C]] -&gt; Hash[[A,B], C]</tt></li>
<li><tt>Hash[A,B] &amp;(A,B-&gt;C) -&gt; Hash[C, Hash[A,B]]</tt></li>
<li><tt>Enumerable[Hash[A,B]] &amp;(A,B,B-&gt;B) -&gt; Hash[A,B]</tt></li>
<li><tt>Hash[A,Set[B]] -&gt; Hash[Set[A], Set[B]]</tt></li>
</ul>
<p>Even these short snippets should give a pretty good idea what these are all about.</p>
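<p>For the first three signatures, here is one plausible reading of each, written out as Ruby. These are guesses that fit the types, not necessarily the functions the signatures were written for, and the helper names are invented.</p>

```ruby
# Hash[A,B] -> Hash[B, Array[A]]: group keys by their value.
def invert_grouped(hash)
  hash.each_with_object({}) { |(k, v), out| (out[v] ||= []) << k }
end

# Hash[A,B] &(A,B->C) -> Hash[A,C]: keep the keys, recompute the values.
def map_values_with_key(hash)
  hash.each_with_object({}) { |(k, v), out| out[k] = yield(k, v) }
end

# Hash[A, Hash[B,C]] -> Hash[[A,B], C]: flatten one level of nesting into pair keys.
def flatten_keys(nested)
  nested.each_with_object({}) do |(a, inner), out|
    inner.each { |b, c| out[[a, b]] = c }
  end
end
```

<p>Other functions fit the same types, which is the point made above: the signature narrows the possibilities without fully specifying the behaviour.</p>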
<p>That&#8217;s it for now. Hopefully it won&#8217;t be long until that promised 1.9 collections post.
<p></a></p>
]]></content:encoded>
			<wfw:commentRss>http://guyub.co.id/tomasz-wegrzanowski-we-need-syntax-for-talking-about-ruby-types/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Wayne Beaton: Eclipse is&#8230; Open Source Projects</title>
		<link>http://guyub.co.id/wayne-beaton-eclipse-is-open-source-projects/</link>
		<comments>http://guyub.co.id/wayne-beaton-eclipse-is-open-source-projects/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 12:33:16 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Aplikasi]]></category>
		<category><![CDATA[F/OSS]]></category>
		<category><![CDATA[Pemrograman]]></category>
		<category><![CDATA[Sindikasi]]></category>
		<category><![CDATA[eclipse]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[project]]></category>

		<guid isPermaLink="false">http://guyub.co.id/wayne-beaton-eclipse-is-open-source-projects/</guid>
		<description><![CDATA[
One of the great things about Eclipse is that, unlike the celestial event and the unfortunately-named movie, everybody gets to see it; regardless of your location on earth, you have access to Eclipse.

But, like Linus, some people are confused as to the nature of Eclipse. To try and help people better understand Eclipse, I&#8217;ve created a &#8220;What [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://dev.eclipse.org/blogs/wayne/2010/07/14/eclipse-is-open-source-projects/">
<p>One of the great things about <a href="http://www.eclipse.org">Eclipse</a> is that, unlike the celestial event and the unfortunately-named movie, everybody gets to see it; regardless of your location on earth, you have access to Eclipse.</p>
<p><a href="http://comics.com/peanuts/2010-07-14/" title="Peanuts"><img alt="Peanuts" border="0" src="http://c0389161.cdn.cloudfiles.rackspacecloud.com/dyn/str_strip/325226.full.gif" /></a></p>
<p>But, like Linus, some people are confused as to the nature of Eclipse. To try and help people better understand Eclipse, I&#8217;ve created a &#8220;<a href="http://www.eclipse.org/resources/resource.php?id=420">What is Eclipse?</a>&#8221; talk that takes an audience step-by-step from what is commonly understood through a voyage of discovery of the true greatness of Eclipse. More specifically, I start by introducing <a href="http://dev.eclipse.org/blogs/wayne/2010/06/16/eclipse-is-a-java-ide/">Eclipse as a Java IDE</a>. This is generally easy for the sorts of audiences that I speak with to understand: folks in the software industry understand IDEs (though there are still a few emacs hermits out there; and I mean &#8220;hermit&#8221; in a wholly-endearing way). I spend the next couple of slides broadening the technical horizon by introducing Eclipse as a platform for <a href="http://dev.eclipse.org/blogs/wayne/2010/06/17/eclipse-is-an-ide-platform/">building IDEs</a>, <a href="http://dev.eclipse.org/blogs/wayne/2010/06/22/eclipse-is-a-tools-platform/">tools</a>, <a href="http://dev.eclipse.org/blogs/wayne/2010/06/29/eclipse-is-an-application-framework/">desktop applications</a>, <a href="http://dev.eclipse.org/blogs/wayne/2010/07/13/eclipse-is-runtimes/">server applications and runtimes</a>, and more.</p>
<p>All this technology is wonderful. But technology is only part of the Eclipse not-so-secret sauce. All of that technology comes from the many open source projects at Eclipse.</p>
<p><a href="http://dev.eclipse.org/blogs/wayne/files/2010/07/eclipseisopensourceprojects.png"><img alt="" class="alignnone size-medium wp-image-1105" height="225" src="http://dev.eclipse.org/blogs/wayne/files/2010/07/eclipseisopensourceprojects-300x225.png" width="300" /></a></p>
<p>We have a lot of projects at Eclipse. <em><a href="http://eclipse.org/projects/listofprojects.php">A lot of projects</a>.</em> Up to this point in the presentation, most of the discussion has been around just a small handful of projects. The &#8220;<a href="http://www.eclipse.org/eclipse">Eclipse</a>&#8221; Project is responsible for creating most of what people think of when they think of Eclipse. Specifically, the Eclipse Project creates what we try very hard to consistently refer to as the &#8220;Eclipse SDK&#8221; (that is, a <em>software development kit</em> for building Eclipse-based applications). The Eclipse Project leverages the work of several other projects (Equinox comes immediately to mind) to provide important bits of information, but most of the bits that people think of when they think &#8220;Eclipse is a Java IDE&#8221; come from the Eclipse Project.</p>
<p>Now this is where things start to get a little weird. The Eclipse Project is what we call a &#8220;Top-Level Project&#8221;. It is, effectively, a container for several smaller-scale projects. Each of these smaller-scale projects (often referred to simply as &#8220;projects&#8221; or &#8220;subprojects&#8221;) is a distinct entity that contributes parts to the greater whole. The <a href="http://eclipse.org/platform/">Platform Project</a>, for example, produces the UI, workbench, and many other fundamental services and frameworks; the <a href="http://www.eclipse.org/jdt">Java development tools</a> (JDT) project produces the Java compiler, editors, debugger, and such; the <a href="http://www.eclipse.org/pde">Plugin-Development Environment</a> (PDE) produces tools to aid in the construction of plug-ins; and more. All these Projects have distinct development teams, web sites, and other resources.</p>
<p>The Eclipse Project is just one of the top-level projects at Eclipse. There are currently twelve top-level projects that organize dozens of projects. Top-level projects provide more than simple organization of projects. Each top-level project has a ?Project Management Committee? (PMC) that is responsible for providing oversight and guidance to the projects in their care. Each top-level project is a little different from the others, reflecting different values and technical areas. Some top-level projects tightly organize their projects; others allow greater levels of flexibility.</p>
<p>The fact of the matter is that we have a heck of a lot of projects at Eclipse. At last count we had more than 250 projects (I can hear you gasp at that number). The project is the finest-grained organizational unit at Eclipse. Each project has its own group of developers (called &#8220;committers&#8221;), its own website, <a href="http://www.eclipse.org/forums">forums</a>, mailing lists, source code repositories, downloads and more. Some projects provide aggregations of other projects; a project can, for example, have subprojects of its own.</p>
<p><a href="http://dev.eclipse.org/blogs/wayne/files/2010/07/projectlayers.png"><img alt="" class="alignnone size-medium wp-image-1108" height="132" src="http://dev.eclipse.org/blogs/wayne/files/2010/07/projectlayers-300x132.png" width="300" /></a></p>
<p>It&#8217;s left to the project teams to decide how they want to organize. Typically, mid-level projects tend to be used to provide some hierarchical organization for related projects. Very often mid-level projects (and top-level projects in some cases) provide handy aggregate builds and downloads of the software produced by the projects they contain. The <a href="http://www.eclipse.org/webtools">Web Tools Platform</a> Project, much like the Eclipse Project, is a good example of this. Web Tools contains multiple separate projects (e.g. <a href="http://www.eclipse.org/webtools/dali">Dali</a> and <a href="http://www.eclipse.org/webtools/ejb/">EJB Tools</a>), but distributes downloads and updates under the top-level project. As an outsider-looking-in, Web Tools comes across as a single source of software (the fact that it is really multiple projects under the covers is a bit of an implementation detail).</p>
<p>So anyway&#8230; we have a lot of projects. They&#8217;re organized under top-level projects that provide oversight and guidance. Chances are very good that we have something going on at Eclipse that interests you. </p>
<p>But Eclipse is more than just technology and projects. Eclipse is&#8230; a Community.</p>
<p><a href="http://www.eclipsesummit.org/"><img border="0" height="60" src="http://www.eclipsecon.org/summiteurope2010/static/image/friends/480x60.jpg" width="480" /></a></p>
<p></a></p>
]]></content:encoded>
			<wfw:commentRss>http://guyub.co.id/wayne-beaton-eclipse-is-open-source-projects/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Making &#8220;Insert Ignore&#8221; Fast, by Avoiding Disk Seeks</title>
		<link>http://guyub.co.id/making-insert-ignore-fast-by-avoiding-disk-seeks/</link>
		<comments>http://guyub.co.id/making-insert-ignore-fast-by-avoiding-disk-seeks/#comments</comments>
		<pubDate>Tue, 06 Jul 2010 13:57:15 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Basisdata]]></category>
		<category><![CDATA[Pemrograman]]></category>
		<category><![CDATA[Server, Jaringan & Keamanan]]></category>
		<category><![CDATA[Sindikasi]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://guyub.co.id/making-insert-ignore-fast-by-avoiding-disk-seeks/</guid>
		<description><![CDATA[
In my post from three weeks ago, I explained why the semantics of normal ad-hoc insertions with a primary key are expensive because they require disk seeks on large data sets. Towards the end of the post, I claimed that it would be better to use &#8220;replace into&#8221; or &#8220;insert ignore&#8221; over normal inserts, because [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://tokutek.com/2010/07/making-insert-ignore-fast-by-avoiding-disk-seeks/"><br />
In my post from three weeks ago, I explained why the semantics of normal ad-hoc insertions with a primary key are expensive because they require disk seeks on large data sets. Towards the end of the post, I claimed that it would be better to use &#8220;replace into&#8221; or &#8220;insert ignore&#8221; over normal inserts, because the semantics of these statements do NOT require disk seeks. In my post last week, I explained how the command &#8220;replace into&#8221; can be fast with TokuDB&#8217;s fractal trees. Today, I explain how &#8220;insert ignore&#8221; can be fast, using a strategy that is very similar to what we do with &#8220;replace into&#8221;.</p>
<p>The semantics of &#8220;insert ignore&#8221; are similar to those of &#8220;replace into&#8221;:</p>
<p> if the primary (or unique) key does not exist: insert the new row<br />
 if the primary (or unique) key does exist: do nothing</p>
<p>B-trees have the same problem with &#8220;insert ignore&#8221; that they have with &#8220;replace into&#8221;. They perform a lookup of the primary key, incurring a disk seek. We have already shown how fractal trees do not incur this disk seek for &#8220;replace into&#8221;, so let&#8217;s see how we can avoid disk seeks with &#8220;insert ignore&#8221;.</p>
<p>The only difference from &#8220;replace into&#8221; is that when the primary (or unique) key exists, instead of overwriting the old row with the new row, we disregard the new row. So, all we need to do is tweak our tombstone messaging scheme (that we use for deletes and &#8220;replace into&#8221;) so that &#8220;insert ignore&#8221; commands do not overwrite old rows with new rows. As with deletes and &#8220;replace into&#8221;, with this scheme &#8220;insert ignore&#8221; can be two orders of magnitude faster than insertions into a B-tree.</p>
<p>Here is what we do. We insert a message into the fractal tree, with a new message &#8220;ii&#8221;, to signify that we are doing an &#8220;insert ignore&#8221;. The only difference between this message and the normal &#8220;i&#8221; message for insertions is what we do on queries and merges. On queries, if the message is an &#8220;ii&#8221;, then the value in the LOWER node is read, and not the higher node. On merges, if the higher node has a message of &#8220;ii&#8221;, the value in the LOWER node takes precedence over the value in the higher node.</p>
<p>Let&#8217;s look at an example that is similar to what we looked at for &#8220;replace into&#8221;:</p>
<p>create table foo (a int, b int, primary key (a));</p>
<p>Suppose the fractal tree for this table looks as follows:</p>
<p>- </p>
<p>- -</p>
<p>- - - -</p>
<p>....</p>
<p>(i (1,1)) (i (2,2)) (i (3,3)) (i (4,4)) &#8230; (i (1000,1000)) &#8230; (i (2^32, 2^32))</p>
<p>The &#8220;i&#8221; stands for insertion message. Now suppose we do:</p>
<p>insert ignore into foo values (1000, 1001).</p>
<p>With fractal trees, we insert (ii (1000,1001)) into the top node. The tree then looks as such:</p>
<p>(ii (1000,1001)) </p>
<p>- -</p>
<p>- - - -</p>
<p>....</p>
<p>(i (1,1)) (i (2,2)) (i (3,3)) (i (4,4)) &#8230; (i (2^32, 2^32))</p>
<p>So upon querying the key &#8216;1000&#8217;, a cursor notices that (1000,1001) has a message of &#8220;ii&#8221;. If it finds another value for the key 1000 in a lower node, it reads that value; otherwise, it reads (1000,1001). Because (1000,1000) is located in a lower node, the cursor returns (1000,1000) to the user. On merges, the message in the lower node, (1000,1000), overwrites the message in the higher node, (1000,1001).</p>
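<p>The query and merge rules for &#8220;ii&#8221; messages can be sketched with a toy two-level model. This is only an illustration of the rule described above, not TokuDB&#8217;s data structure: the class <tt>ToyTree</tt>, its two hashes and its method names are all assumptions made for the example.</p>

```ruby
# Toy model: "ii" messages are buffered in an upper node, rows live in a lower node.
class ToyTree
  def initialize(rows = {})
    @lower = rows.dup  # key => value, rows already stored in the lower node
    @upper = {}        # key => value, buffered "ii" messages
  end

  # Buffers the message without looking at the lower node
  # (the lookup is what would cost a disk seek in a B-tree).
  def insert_ignore(key, value)
    # An earlier buffered message for the same key is the older row, so it stays.
    @upper[key] = value unless @upper.key?(key)
  end

  # Query rule: for an "ii" message, a value in the lower node takes precedence.
  def get(key)
    return @lower[key] unless @upper.key?(key)
    @lower.fetch(key, @upper[key])
  end

  # Merge rule: push messages down, never overwriting an existing lower row.
  def merge!
    @upper.each { |key, value| @lower[key] = value unless @lower.key?(key) }
    @upper.clear
  end
end
```

<p>With a tree built from <tt>{1000 =&gt; 1000}</tt>, calling <tt>insert_ignore(1000, 1001)</tt> leaves <tt>get(1000)</tt> at 1000 both before and after <tt>merge!</tt>, while a key that was absent takes the new value.</p>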
<p>While &#8220;insert ignore&#8221; can be fast, there are caveats (indexes, triggers, replication), just as there are with &#8220;replace into&#8221;. In a future posting, I will get into some of them.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://guyub.co.id/making-insert-ignore-fast-by-avoiding-disk-seeks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CORE GRASP &#8211; PHP Tainted Mode</title>
		<link>http://guyub.co.id/core-grasp-php-tainted-mode/</link>
		<comments>http://guyub.co.id/core-grasp-php-tainted-mode/#comments</comments>
		<pubDate>Mon, 05 Jul 2010 20:11:16 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Pemrograman]]></category>
		<category><![CDATA[Sindikasi]]></category>
		<category><![CDATA[core grasp]]></category>
		<category><![CDATA[php]]></category>

		<guid isPermaLink="false">http://guyub.co.id/core-grasp-php-tainted-mode/</guid>
		<description><![CDATA[
        
Core Security Technologies today announced the release of CORE GRASP, which is a patch against the PHP 5.2.3 code tree that adds a tainted mode to PHP to protect the mysql_query() function. Their implementation adds a tainted or not flag for every byte so that it is [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.php-security.org/archives/92-CORE-GRASP-PHP-Tainted-Mode.html"><br />
        <br />
<a href="http://www.coresecurity.com">Core Security Technologies</a> today announced the release of <a href="http://grasp.coresecurity.com/index.php?m=dld">CORE GRASP</a>, which is a patch against the PHP 5.2.3 code tree that adds a tainted mode to PHP to protect the <a href="http://www.php.net/mysql_query">mysql_query()</a> function. Their implementation adds a tainted or not flag for every byte so that it is possible on invocation of mysql_query() to determine any kind of injection.
<p>Adding such a tainted mode to PHP has been discussed several times in the past. It was rejected for several reasons, like the obvious huge speed impact, the danger of false positives, and a false sense of security. And indeed, the way CORE GRASP is implemented, it looks like a huge memory and speed overhead that should be tested. In addition to that, their query parser will, for example, wrongly detect quotes escaped by doubling as an injection attack.</p>
<p>Aside from this, there are several other possible problems in the code, like a remote one-byte stack overflow (that seems harmless due to memory alignment) and wrong handling of the _SERVER superglobal in case of JIT, and it also seems that control characters like linebreaks can be injected into the logfiles. Further analysis and a deeper look into the code are needed.</p>
<p>However, it has to be taken into account that this is the very first public version of CORE GRASP, so maybe all these problems will be gone soon and support for further database engines will be added. </p>
<p>    </a></p>
]]></content:encoded>
			<wfw:commentRss>http://guyub.co.id/core-grasp-php-tainted-mode/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>About the CSRF Redirector</title>
		<link>http://guyub.co.id/about-the-csrf-redirector/</link>
		<comments>http://guyub.co.id/about-the-csrf-redirector/#comments</comments>
		<pubDate>Mon, 05 Jul 2010 20:11:16 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Pemrograman]]></category>
		<category><![CDATA[Sindikasi]]></category>
		<category><![CDATA[csrf]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[redirector]]></category>

		<guid isPermaLink="false">http://guyub.co.id/about-the-csrf-redirector/</guid>
		<description><![CDATA[
        
You might have seen this post on Chris&#8217;s blog about a CSRF redirector he wrote. This is basically nothing more than a little script that turns a GET request into a hidden form that is then posted via JavaScript. There have always been security issues with redirector [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.php-security.org/archives/89-About-the-CSRF-Redirector.html"><br />
        <br />
You might have seen <a href="http://shiflett.org/blog/2007/jul/csrf-redirector">this post</a> on Chris&#8217;s blog about a CSRF redirector he wrote. This is basically nothing more than a little script that turns a GET request into a hidden form that is then posted via JavaScript. There have always been security issues with redirector scripts, and if you provide one open to anyone, you should care about what kind of redirects you actually allow.</p>
<p>Two major risks exist with Chris&#8217;s example:
</p>
<ol>
<li>Malicious people could misuse them as bouncers to attack other sites
</li>
<li>Not every URL is a web page. Some can load plugins, display information and<br />
some can execute JavaScript.</p>
</li>
</ol>
<p>Here is an example URL:
</p>
<p><a href="http://shiflett.org/csrf.php?csrf=+++%6aavascript:alert(/I_AM_A_SECURITY_EXPERT/)">http://shiflett.org/csrf.php?csrf=javascript:alert(/I_AM_A_SECURITY_EXPERT/)</a>
</p>
<p>In Internet Explorer (and Safari) this will give you access to the domain (cookies, etc&#8230;). In Firefox you can still do other funny things.</p>
<p>So if you implement (JavaScript) redirector scripts, make sure you properly whitelist the user-supplied URLs.</p>
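<p>A minimal sketch of such a whitelist check, written in Ruby rather than PHP. The constant <tt>ALLOWED_HOSTS</tt>, its example hosts and the method name <tt>safe_redirect_target?</tt> are placeholders, and a real redirector would likely need further checks on top of this.</p>

```ruby
require "uri"

# Hypothetical allow-list; a real deployment would list its own hosts.
ALLOWED_HOSTS = %w[example.com www.example.com].freeze

# Returns true only for http/https URLs whose host is on the allow-list.
def safe_redirect_target?(raw_url)
  uri = URI.parse(raw_url.to_s.strip)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes
  # and rejects javascript:, data: and anything else.
  return false unless uri.is_a?(URI::HTTP)
  ALLOWED_HOSTS.include?(uri.host.to_s.downcase)
rescue URI::InvalidURIError
  false
end
```

<p>Checking the parsed scheme and host, instead of pattern-matching the raw string, is what keeps both the bouncer case and the non-web-page case out.</p>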
<p><b>UPDATE:</b> The above example of a simple XSS no longer works. However, there are still other XSS vulnerabilities in the CSRF redirector, like variable-width problems, and it is still an open bouncer for malicious persons.</p>
<p>    </a></p>
]]></content:encoded>
			<wfw:commentRss>http://guyub.co.id/about-the-csrf-redirector/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
