Posts in the ‘Syndication’ category

Roy Ganor: if-ify, for-ify, foreach-ify and func-ify

Feb 19, 2012

A common question among developers is how to surround the current selection in the editor with a parent statement. For example, you start coding a statement that validates an expression and then want to iterate over an array of expressions and apply the same check to every element.
To this end, Eclipse templates provide two variables, “line_selection” and “word_selection”, that help you build custom selection-based wrappers.
In this example three new templates were added to simplify this scenario:

You can import these templates (xml below) to your IDE via Preferences > PHP > Editor > Templates > Import…

MariaDB 5.3.4 benchmarks

Feb 19, 2012

MariaDB 5.3 has reached the release candidate milestone, and the 5.3 version promises a lot of new features and optimizations (e.g. in the optimizer: http://kb.askmonty.org/en/what-is-mariadb-53#query-optimizer). No surprise, I wanted to check how all these improvements affect general performance.
So why don’t we run the good old sysbench benchmark?

For the benchmark I took:

HP ProLiant DL380 G6 box
sysbench multitables oltp rw workload, 16 tables, 500mil rows each, total datasize about 30GB
working threads from 1 to 256
Versions: MariaDB 5.3.4, MySQL 5.5.20
Data is stored on RAID10 HDD partition
As in all my recent benchmarks, I take throughput measurements every 10 sec, so we can see the stability of the throughput

The raw results, configuration and scripts are available on our Benchmarks Launchpad
The graphical results:
Throughput (more is better)

Threads   MariaDB 5.3.4   MySQL 5.5.20   Ratio (MariaDB / MySQL)
1         252             271            0.9298893
2         412             588            0.7006803
4         801             1097           0.7301732
8         1709            2205           0.7750567
16        3197            4076           0.7843474
32        3303            4166           0.7928469
64        3336            4150           0.8038554
128       3800            4170           0.9112710
256       3710            4131           0.8980876

I was surprised to see that MariaDB shows about 20-30% worse throughput at 2 to 64 threads, and roughly 7-10% worse at 1, 128 and 256 threads.
It seems many changes resulted in a general performance hit. I wonder whether the MariaDB team runs performance regression benchmarks, and if they do, why we see such a performance decline.
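The Ratio column is MariaDB throughput divided by MySQL throughput, so a value below 1.0 means MariaDB is slower. A minimal sketch of that arithmetic follows; the class and method names are illustrative only, and the figures are copied from a few rows of the table.

```java
import java.util.Locale;

final class ThroughputRatio {
    // Ratio of MariaDB throughput to MySQL throughput; below 1.0 means MariaDB is slower.
    static double ratio(double mariadb, double mysql) {
        return mariadb / mysql;
    }

    // Percentage by which MariaDB trails MySQL.
    static double percentSlower(double mariadb, double mysql) {
        return (1.0 - ratio(mariadb, mysql)) * 100.0;
    }

    public static void main(String[] args) {
        // {threads, MariaDB 5.3.4, MySQL 5.5.20}, copied from the table
        int[][] rows = {{1, 252, 271}, {2, 412, 588}, {64, 3336, 4150}, {256, 3710, 4131}};
        for (int[] r : rows) {
            System.out.printf(Locale.ROOT, "%3d threads: ratio %.4f, %.1f%% slower%n",
                    r[0], ratio(r[1], r[2]), percentSlower(r[1], r[2]));
        }
    }
}
```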
Follow @VadimTk

Can we improve MySQL variable handling?

Feb 19, 2012

MySQL settings (also known as server variables) have an interesting property. When you set a variable on a running server, the change is not persisted in any way, and the server will revert to the old value upon restart. Unlike some other software, MySQL also has no option to re-read its config file without restarting, so the approach of changing the config file and then instructing the server to re-read it does not work either. As a result, runtime settings drift away from the settings in the config file, and unexpected changes on restart are a frequent problem.

pt-config-diff is a tool that helps a lot with this problem: it can compare the settings in my.cnf to those the server is currently running with. The catch is that this only works well if the settings are set in my.cnf. If a variable was left at its default and we change it at run time, we can’t detect the change easily, because MySQL Server does not seem to have an easy way to check what the default value for a given server variable was.
The only way I’m aware of is running the server from the command line with the --no-defaults --verbose --help options:

pz@ubuntu:~$ /usr/sbin/mysqld --no-defaults --verbose --help

timed-mutexes FALSE
tmp-table-size 16777216
tmpdir /tmp
transaction-alloc-block-size 8192
transaction-isolation REPEATABLE-READ
transaction-prealloc-size 4096
updatable-views-with-limit YES
userstat FALSE
verbose TRUE
wait-timeout 28800

This is rather ugly, however, and only works with shell access to the server, which is not always available.
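If shell access is available, the help output can at least be post-processed. The following is a rough sketch, not an existing tool: it assumes the variables section has already been cut out of the help text, that each line is a name followed by whitespace and a value (as in the excerpt above), and that the running values were fetched separately into a map. The class and method names are hypothetical. Values are compared as plain strings, so booleans printed differently in the two sources (for example FALSE versus OFF) would need extra normalization.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class VariableDefaults {
    // Parse "name value" lines taken from the variables section of
    // `mysqld --no-defaults --verbose --help`.
    static Map<String, String> parseDefaults(String helpOutput) {
        Map<String, String> defaults = new LinkedHashMap<>();
        for (String line : helpOutput.split("\\R")) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length == 2) {
                // The help output spells names with dashes; server variables use underscores.
                defaults.put(parts[0].replace('-', '_'), parts[1].trim());
            }
        }
        return defaults;
    }

    // Return the variables whose running value differs from the parsed default.
    static Map<String, String> changedFromDefault(Map<String, String> defaults,
                                                  Map<String, String> running) {
        Map<String, String> diff = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : running.entrySet()) {
            String def = defaults.get(e.getKey());
            if (def != null && !def.equalsIgnoreCase(e.getValue())) {
                diff.put(e.getKey(), def + " -> " + e.getValue());
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        String help = "tmp-table-size 16777216\ntmpdir /tmp\nwait-timeout 28800\n";
        Map<String, String> running = new LinkedHashMap<>();
        running.put("tmp_table_size", "16777216");
        running.put("wait_timeout", "600"); // hypothetical value changed at run time
        System.out.println(changedFromDefault(parseDefaults(help), running));
    }
}
```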
Interestingly enough, MySQL allows you to SET a variable to its default value (the compile-time default, not the one the server was started with), yet there seems to be no way to read it:

mysql> set global sort_buffer_size=DEFAULT;
Query OK, 0 rows affected (0.00 sec)

mysql> select @@global.sort_buffer_size;
+---------------------------+
| @@global.sort_buffer_size |
+---------------------------+
|                   2097152 |
+---------------------------+
1 row in set (0.00 sec)

This could be used as a technique to detect the DEFAULT value for SESSION variables, yet for some GLOBAL variables setting them back and forth would not be safe.
A simple change that would make dealing with MySQL variables in an automated way a lot more convenient would be to extend INFORMATION_SCHEMA.GLOBAL_VARIABLES. As of MySQL 5.5 it contains only the variable name and value. I would suggest adding a few more columns, such as DEFAULT, to hold the compile-time default value for the variable, and STARTUP, to hold the value the server was started with.
It might also be a good idea to extend the SELECT syntax to ease querying of a variable’s global value. Right now I can select:

mysql> select @@global.sort_buffer_size;
+---------------------------+
| @@global.sort_buffer_size |
+---------------------------+
|                   2097152 |
+---------------------------+
1 row in set (0.00 sec)

If I could refer to “default” or “startup” in addition to the “global” and “session” prefixes that are available now, it would be quite nice.

The benefit of keeping the InnoDB transaction log in cache

Feb 19, 2012

I was getting inconsistent performance results while running sysbench to generate a workload of point-updates and point-lookups. The rate of rows updated per second would vary between 200 and 600 and the variance appeared to be random. From PMP there was a lot of contention on the transaction log system mutex. It took me about one day to guess at the root cause: the InnoDB transaction log was not in the OS buffer cache, and 512-byte aligned log writes frequently required disk reads to get a 4kb aligned page into the cache to apply the write. The problem is avoided when either the transaction log remains in the OS buffer cache or you use the Percona patch that adds all_o_direct as an option for innodb_flush_method.

The problem was harder to debug than it should be because InnoDB didn’t report log write latency via a separate metric. All synchronous writes, log and doublewrite buffer, were reported via the “Sync writes” line, and that combines large/slow writes to the doublewrite buffer with small/fast writes to the transaction log:

Sync writes: 4836209719 requests, 0 old, 5009.82 bytes/r, svc: 498087.30 secs, 0.10 msecs/r

I have a diff out to fix that:

Log writes: 2328355 requests, 0 old, 955.39 bytes/r, svc: 14.44 secs, 0.01 msecs/r
Doublewrite buffer writes: 19718 requests, 12 old, 946736.45 bytes/r, svc: 41.29 secs, 2.09 msecs/r

I suspect this is even harder to debug in official MySQL, which doesn’t have any of the metrics above in SHOW INNODB STATUS output or SHOW STATUS counters. Perhaps the performance schema makes this easier to debug, but I don’t know much about that feature.

I then reproduced the problem by starting the benchmark with the transaction log in the OS buffer cache and then running echo 1 > /proc/sys/vm/drop_caches to remove it from cache. The results below show the impact on both the rate of rows read and updated per second. The test used 128 client threads doing point-lookups and 32 client threads doing point-updates. The database was 240G on disk and the InnoDB buffer cache was 30G. The rates drop significantly when the cache is dropped.

The results are great prior to removing the transaction log from cache. On a server with 8 10k RPM SAS disks I was able to get 2800 point-lookups and 500 point-updates per second. Using 16kb InnoDB pages the server sustained 2500 page reads/second from disk and 500 page writes/second to disk.

The final graph uses logscale for the y-axis to plot the rate of rows updated/second and the average latency for a log write in microseconds. The update rate drops when the log write latency spikes to more than 10ms per write. It was less than 10us prior to that.
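The msecs/r figure in the status lines quoted in this post appears to be the total service time divided by the number of requests. A small sketch under that assumption, with illustrative class and method names and the request counts and service times copied from those lines, shows how the combined “Sync writes” average sits between the two separate averages and hides how fast the log writes are:

```java
import java.util.Locale;

final class WriteLatency {
    // Average latency per request in milliseconds, assuming msecs/r = svc seconds * 1000 / requests.
    static double msecsPerRequest(double svcSeconds, long requests) {
        return svcSeconds * 1000.0 / requests;
    }

    public static void main(String[] args) {
        // Values copied from the "Sync writes", "Log writes" and "Doublewrite buffer writes" lines.
        System.out.printf(Locale.ROOT, "sync writes:        %.4f msecs/r%n",
                msecsPerRequest(498087.30, 4836209719L));
        System.out.printf(Locale.ROOT, "log writes:         %.4f msecs/r%n",
                msecsPerRequest(14.44, 2328355L));
        System.out.printf(Locale.ROOT, "doublewrite writes: %.4f msecs/r%n",
                msecsPerRequest(41.29, 19718L));
    }
}
```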

NoSQL performance numbers – MySQL and Redis

Feb 18, 2012

Links to performance numbers posted wrt various NoSQL solutions:
A top 20 global website announced they have migrated from MySQL to Redis. There will be a keynote and everything. It doesn’t say how big the Redis Cluster is, but they serve 100M pages / day, and clock 300k Redis queries / second.

https://groups.google.com/forum/?fromgroups#!topic/redis-db/d4QcWV0p-YM

Btw, they mention that MySQL remains as the master data store from which the Redis indexes are generated.
(The reason I don’t mention the name of this Redis user is simply that I fear my mom sometimes reads my blog…)

The Mysterious Phantom Reference

Sep 29, 2011

I talk about java.lang.ref.* in my performance tuning course because these things (along with anything that implements finalize) are more expensive to create than normal objects and require at least two rounds of GC before you’re completely rid of them. Of the bunch, which includes Reference, WeakReference, SoftReference and two other private reference classes that get mixed up with finalization, PhantomReference has to be the strangest.

 

Though I talk about PhantomReference, I’ve never used it, nor could I think of a situation where I’d even think to use it! Which is, I guess, why I’ve never used it. In fact, I can’t remember ever seeing it used anywhere in the wild. But, as is the case with just about everything, if you talk to enough people you will eventually run into someone who has tried to use even the most arcane features in Java. True to this point, this week I finally ran into someone who had a convincing use case for PhantomReference.

 

This person was trying to track down who was leaking JDBC connections. His idea was to wrap the connection in a PhantomReference; when the connection was discarded (and not closed), the garbage collector would put the object into a supplied ReferenceQueue, and you’d grab the object from the ReferenceQueue and not only close it but also try to sort out who should have closed it. As with just about everything, the devil is in the details, and in this case the details are: the reference queue doesn’t return the object wrapped in the PhantomReference, it returns the PhantomReference itself, and PhantomReference.get() always returns null. The reason for this is that the wrapped object is half collected and should not be reconnected to anything. I get that, but what I don’t get is why, given this condition, PhantomReference is marketed as an alternative mechanism to finalization. With no means to access the wrapped object (reflection aside), there isn’t much one can do to clean things up. Consider the following code.

 

public class Foo {

    private String bar;

    public Foo(String bar) {
        this.bar = bar;
    }

    public String foo() {
        return bar;
    }
}

 

So let’s say that after the object has been completely dereferenced by the application I want to somehow call foo(). Here is some code that I expected would do this, with one niggle.

 

        

 

// initialize
ReferenceQueue<Foo> queue = new ReferenceQueue<Foo>();
ArrayList<PhantomReference<Foo>> list = new ArrayList<PhantomReference<Foo>>();

for (int i = 0; i < 10; i++) {
    Foo o = new Foo(Integer.toOctalString(i));
    list.add(new PhantomReference<Foo>(o, queue));
}

// make sure the garbage collector does its magic
System.gc();

// let's see what we've got
Reference<? extends Foo> referenceFromQueue;
for (PhantomReference<Foo> reference : list)
    System.out.println(reference.isEnqueued());

while ((referenceFromQueue = queue.poll()) != null) {
    System.out.println(referenceFromQueue.get());
    referenceFromQueue.clear();
}

 

PhantomReference takes an instance of Foo and a ReferenceQueue. Since no handles are kept to Foo, it should immediately be dead. Next, I tell the VM to collect, as there isn’t enough in the heap for it to trigger a collection naturally. The first thing I ask the PhantomReference is: have you been enqueued? In this case the answer will be true. Next I ask the queue for the reference but, as you can see, calling get() always returns null.

 

About the only solution that made sense is to wrap the resources or objects you wanted to interact with in a subclass of PhantomReference.

 

// T is an ordinary type parameter, named so that it does not shadow the Foo class
public class FinalizeStuff<T> extends PhantomReference<T> {

    public FinalizeStuff(T referent, ReferenceQueue<? super T> queue) {
        super(referent, queue);
    }

    public void bar() {
        System.out.println("foobar is finalizing resources");
    }
}

 

In this case I’m not going to wrap Foo in the subclass as that would seem to violate the spirit of PhantomReference. Instead I’m going to wrap resources associated with Foo and interact with them. Now I can do this.

 

// initialize
ReferenceQueue<Foo> queue = new ReferenceQueue<Foo>();
ArrayList<FinalizeStuff<Foo>> list = new ArrayList<FinalizeStuff<Foo>>();
ArrayList<Foo> foobar = new ArrayList<Foo>();

for (int i = 0; i < 10; i++) {
    Foo o = new Foo(Integer.toOctalString(i));
    foobar.add(o);
    list.add(new FinalizeStuff<Foo>(o, queue));
}

// release all references to Foo and make sure the garbage collector does its magic
foobar = null;
System.gc();

// should be enqueued
Reference<? extends Foo> referenceFromQueue;
for (PhantomReference<Foo> reference : list) {
    System.out.println(reference.isEnqueued());
}

// now we can call bar to do whatever it is we need done
while ((referenceFromQueue = queue.poll()) != null) {
    ((FinalizeStuff) referenceFromQueue).bar();
    referenceFromQueue.clear();
}

 

This works, though in some variations of this implementation the main thread was racing against another thread (the GC is my best guess). Note the strange need to cast. I could not sort out how to avoid it, so if someone wants to comment... Returning to the use case, the subclass that was created for the JDBC leak captured a stack trace, which was logged when the PhantomReference was pulled from the reference queue.
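The leak-tracking use case could be sketched roughly as follows. This is not the code that person wrote; LeakTracker, TrackedReference, track, markClosed and reportLeaks are hypothetical names, and the sketch assumes the owner calls markClosed when it closes the resource properly. The cast is still there because ReferenceQueue.poll() hands back the base Reference type.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class LeakTracker {
    // Phantom reference that remembers where its referent was acquired.
    static final class TrackedReference extends PhantomReference<Object> {
        final Throwable acquiredAt = new Throwable("resource acquired here");

        TrackedReference(Object resource, ReferenceQueue<Object> queue) {
            super(resource, queue);
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The references themselves must stay strongly reachable, otherwise they are never enqueued.
    private final Set<TrackedReference> live = ConcurrentHashMap.newKeySet();

    TrackedReference track(Object resource) {
        TrackedReference ref = new TrackedReference(resource, queue);
        live.add(ref);
        return ref;
    }

    // Called when the resource is closed properly; a cleared reference is not enqueued.
    void markClosed(TrackedReference ref) {
        live.remove(ref);
        ref.clear();
    }

    // Drain the queue; every entry is a resource that was collected without being closed.
    int reportLeaks() {
        int leaks = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            TrackedReference tracked = (TrackedReference) ref; // poll() returns the base type
            live.remove(tracked);
            tracked.acquiredAt.printStackTrace();
            leaks++;
        }
        return leaks;
    }

    public static void main(String[] args) throws InterruptedException {
        LeakTracker tracker = new LeakTracker();
        tracker.track(new Object()); // never closed, so it may be reported once collected
        System.gc();                 // only a hint; collection is not guaranteed
        Thread.sleep(100);
        System.out.println("leaks found: " + tracker.reportLeaks());
    }
}
```

Whether the leak is actually reported in main depends on when the collector runs, which is the same race mentioned above.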

 

I’m happy for comments from anyone who has actually found a good use for PhantomReference as, quite frankly, I still don’t understand why anyone would use it in lieu of finalization. While finalization isn’t perfect, it’s not the dog that everyone makes it out to be, and it’s far safer to use than PhantomReference is. For example, if anything bad happens during finalization, you’ll only shoot down a helper thread. Furthermore, the only way finalize will not be called is if the VM fails catastrophically or you’ve requested a shutdown but have failed to specify that finalizers should run before doing so (Runtime.runFinalizersOnExit(true)). No such guarantees exist for PhantomReference.

 

PS: one point in favor of PhantomReference over finalize is that you could configure your system to optionally wrap objects. But then, you could use an interface with two implementations, one that implements finalize() and one that doesn’t, to achieve the same effect. So, I’m still scratching my head over this one.

MariaDB: the new MySQL? Interview with Michael Monty Widenius.

Sep 29, 2011

“I want to ensure that the MySQL code base (under the name of MariaDB) will survive as open source, in spite of what Oracle may do.” - Michael “Monty” Widenius. Michael “Monty” Widenius is the main author of the original version of the open-source MySQL database and a founding member of the MySQL AB company. [...]

InnoDB at Oracle OpenWorld

Sep 28, 2011

Sunny and I will be presenting at the Oracle OpenWorld next week:

Introduction to InnoDB, MySQL’s Default Storage Engine, 10/04/11 Tuesday 01:15 PM, Marriott Marquis – Golden Gate C3, Calvin Sun
InnoDB Performance Tuning, 10/04/11 Tuesday 03:30 PM, Marriott Marquis – Golden Gate C2, Sunny Bains

The first session is for beginners, who are new to InnoDB and MySQL. The second session will cover many new performance features in MySQL 5.5 and 5.6, and share some tuning tips to maximize MySQL performance.
Want to learn more about MySQL? There will be something for everyone. Come join us!

 

JavaScript: Asynchronous Script Loading and Lazy Loading – Federico Cargnelutti

Jul 12, 2011

Most of the time remote scripts are included at the end of an html document, right before the closing body tag. This is because browsers are single threaded and when they encounter a script tag, they halt any other processes until they download and parse the script. By including scripts at the end, you allow the browser to download and render all page elements, style sheets and images without any unnecessary delay. Also, if the browser renders the page before executing any script, you know that all page elements are already available to retrieve.

However, websites like Facebook, for example, use a more advanced technique. They include scripts dynamically via DOM methods. This technique, which I’ll briefly explain here, is known as “Asynchronous Script Loading”.

Let’s take a look at the script that Facebook uses to download its JS library:

(function () {
    var e = document.createElement('script');
    e.src = 'http://connect.facebook.net/en_US/all.js';
    e.async = true;
    document.getElementById('fb-root').appendChild(e);
}());

When you dynamically append a script to a page, the browser does not halt other processes, so it continues rendering page elements and downloading resources. The best place to put this code is right after the opening body tag. This allows Facebook initialization to happen in parallel with the initialization on the rest of the page.

Facebook also makes non-blocking loading of the script easy to use by providing the fbAsyncInit hook. If this global function is defined, it will be executed when the library is loaded.

window.fbAsyncInit = function () {
    FB.init({
        appId: 'YOUR APP ID',
        status: true,
        cookie: true,
        xfbml: true
    });
};

Once the library has loaded, Facebook checks the value of window.fbAsyncInit.hasRun, and if it’s not yet set to true it makes a call to the fbAsyncInit function:

if (window.fbAsyncInit && !window.fbAsyncInit.hasRun) {
    window.fbAsyncInit.hasRun = true;
    fbAsyncInit();
}

Now, what if you want to load multiple files asynchronously, or you need to include a small amount of code at page load and then download other scripts only when needed? Loading scripts on demand is called “Lazy Loading”. Many libraries exist specifically for this purpose; however, you only need a few lines of JavaScript to do this.

Here is an example:

$L = function (c, d) {
    for (var b = c.length, e = b, f = function () {
            if (!(this.readyState
            		&& this.readyState !== "complete"
            		&& this.readyState !== "loaded")) {
                this.onload = this.onreadystatechange = null;
                --e || d()
            }
        }, g = document.getElementsByTagName("head")[0], i = function (h) {
            var a = document.createElement("script");
            a.async = true;
            a.src = h;
            a.onload = a.onreadystatechange = f;
            g.appendChild(a)
        }; b;) i(c[--b])
};

The best place to put this code is inside the head tag. You can then use the $L function to asynchronously load your scripts on demand. $L takes two arguments: an array (c) and a callback function (d).

var scripts = [];
scripts[0] = 'http://www.google-analytics.com/ga.js';
scripts[1] = 'http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.js';

$L(scripts, function () {
    console.log("ga and jquery scripts loaded");
});

$L(['http://connect.facebook.net/en_US/all.js'], function () {
    console.log("facebook script loaded");
    window.fbAsyncInit.hasRun = true;
    FB.init({
        appId: 'YOUR APP ID',
        status: true,
        cookie: true,
        xfbml: true
    });
});

You can see this script in action here (right click -> view page source).



PHP Returning Numeric Values in JSON – Lorna Mitchell

Jul 12, 2011


When I wrote about launching a prototype of a new joind.in API, quite a few people started to try it out. My friend David Soria Parra emailed me to point out that many of the numbers in the API were being returned as strings. He said:

It’s just a standard problem of PHP REST services. When I try to access it with java I have to convert it over and over again to ints.

I did have a quick look at the PHP manual page for json_encode but I didn’t see anything mentioning this. A few weeks later (my inbox is a black hole and it takes a while to process these things) I stumbled over a throwaway reference to an undocumented constant, JSON_NUMERIC_CHECK, and I added the constant name to my todo list. In the time it took for me to actually get around to googling for this, some wonderful person updated the PHP manual page (this is why I love PHP) to include it as a documented option, and someone else had added a user contributed note about using it.

It turns out, this constant does exactly what I need. Here’s a simple use case:

echo json_encode(array('event_id' => '603'));
echo json_encode(array('event_id' => '603'), JSON_NUMERIC_CHECK);

and the output:

{"event_id":"603"}
{"event_id":603}

There are probably some situations in which you don’t want all your looks-like-a-number data to be returned as a number, but for now it seems to be a good fit for api.joind.in.