tag:blogger.com,1999:blog-42756320071652207042024-02-19T18:08:27.483+10:30AppEngine DevelopmentEmlyn's Professional AppEngine BlogAnonymoushttp://www.blogger.com/profile/11980745475562786998noreply@blogger.comBlogger12125tag:blogger.com,1999:blog-4275632007165220704.post-31084118756839009792013-01-06T15:07:00.000+10:302014-02-10T16:03:04.195+10:30gaedocstore: JSON Document Database Layer for ndb<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiocLqkZHUpuM8yenvdKASZHO7UPA6Ge2CjH2XdY_wt1gFSZ7qQIGBE6crVWSR9gXP63BB2gJXhnCcLFRQhFoFf7Fx_nbYSOMYhyphenhyphenkQpDYWGLi7O_O-5Vl88MzKDc2Cx7B0aeGiY-QgX3eA/s1600/jsonlogo.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiocLqkZHUpuM8yenvdKASZHO7UPA6Ge2CjH2XdY_wt1gFSZ7qQIGBE6crVWSR9gXP63BB2gJXhnCcLFRQhFoFf7Fx_nbYSOMYhyphenhyphenkQpDYWGLi7O_O-5Vl88MzKDc2Cx7B0aeGiY-QgX3eA/s1600/jsonlogo.jpg" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
In my professional life I'm working on a server side appengine based system whose next iteration needs to be really good at dealing with schema-less data; JSON objects, in practical terms.
To that end I've thrown together a simple document database layer to sit on top of appengine's ndb, in python.<br />
<br />
Here's the github repo: <a href="https://github.com/emlynoregan/gaedocstore">https://github.com/emlynoregan/gaedocstore</a><br />
<br />
And here's the doco as it currently exists in the repo; it should explain what I'm up to.<br />
<br />
This library will no doubt change as it begins to be used in earnest.
<h1 style="-webkit-font-smoothing: antialiased; border: 0px; cursor: text; font-size: 28px; margin: 0px 0px 10px; padding: 0px; position: relative;">
gaedocstore</h1>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; padding: 0px;">
<em style="border: 0px; margin: 0px; padding: 0px;">gaedocstore is MIT licensed </em><a href="http://opensource.org/licenses/MIT" style="border: 0px; color: #4183c4; margin: 0px; padding: 0px; text-decoration: initial;">http://opensource.org/licenses/MIT</a></div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
gaedocstore is a lightweight document database implementation that sits on top of ndb in google appengine.</div>
<h2 style="-webkit-font-smoothing: antialiased; border-bottom-color: rgb(204, 204, 204); border-bottom-style: solid; border-width: 0px 0px 1px; cursor: text; font-size: 24px; margin: 20px 0px 10px; padding: 0px; position: relative;">
<a class="anchor" href="https://github.com/emlynoregan/gaedocstore/edit/master/README.md#introduction" name="introduction" style="border: 0px; bottom: 0px; color: #4183c4; cursor: pointer; display: block; left: 0px; margin: 0px 0px 0px -30px; padding: 0px 0px 0px 30px; position: absolute; text-decoration: initial; top: 0px;"></a>Introduction</h2>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; padding: 0px;">
If you are using appengine for your platform, but you need to store arbitrary (data defined) entities, rather than pre-defined schema based entities, then gaedocstore can help.</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
gaedocstore takes arbitrary JSON object structures, and stores them to a single ndb datastore object called GDSDocument.</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
In ndb, JSON can simply be stored in a JSON property. Unfortunately that is a blob, and so unindexed. This library stores the bulk of the document in first class expando properties, which are indexed, and only resorts to JSON blobs where it can't be helped (and where you are unlikely to want to search anyway).</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
gaedocstore also provides a method for denormalised linking of objects; that is, inserting one document into another based on a reference key, and keeping the inserted, denormalised copy up to date as the source document changes. Amongst other uses, this allows you to provide performant REST apis in which objects are decorated with related information, without the penalty of secondary lookups.</div>
<h2 style="-webkit-font-smoothing: antialiased; border-bottom-color: rgb(204, 204, 204); border-bottom-style: solid; border-width: 0px 0px 1px; cursor: text; font-size: 24px; margin: 20px 0px 10px; padding: 0px; position: relative;">
<a class="anchor" href="https://github.com/emlynoregan/gaedocstore/edit/master/README.md#simple-put" name="simple-put" style="border: 0px; bottom: 0px; color: #4183c4; cursor: pointer; display: block; left: 0px; margin: 0px 0px 0px -30px; padding: 0px 0px 0px 30px; position: absolute; text-decoration: initial; top: 0px;"></a>Simple Put</h2>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; padding: 0px;">
When JSON is stored to the document store, it is converted to a GDSDocument object (an Expando model subclass) as follows:</div>
<ul style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin: 15px 0px; padding: 0px 0px 0px 30px;">
<li style="border: 0px; margin: 0px; padding: 0px;"><div style="border: 0px; padding: 0px;">
Say we are storing an object called Input.</div>
</li>
<li style="border: 0px; margin: 0px; padding: 0px;"><div style="border: 0px; padding: 0px;">
Input must be a dictionary.</div>
</li>
<li style="border: 0px; margin: 0px; padding: 0px;"><div style="border: 0px; padding: 0px;">
Input must include a key at minimum. If no key is provided, the put is rejected.</div>
<ul style="border: 0px; margin: 0px; padding: 0px 0px 0px 30px;">
<li style="border: 0px; margin: 0px; padding: 0px;">If the key already exists for a GDSDocument, then that object is updated using the new JSON.</li>
<li style="border: 0px; margin: 0px; padding: 0px;">With an update, you can indicate "Replace" or "Update" (default is Replace). Replace entirely replaces the existing entity. "Update" merges the entity with the existing stored entity, preferentially including information from the new JSON.</li>
<li style="border: 0px; margin: 0px; padding: 0px;">If the key doesn't already exist, then a new GDSDocument is created for that key.</li>
</ul>
</li>
<li style="border: 0px; margin: 0px; padding: 0px;"><div style="border: 0px; padding: 0px;">
The top level dict is mapped to the GDSDocument (which is an expando).</div>
</li>
<li style="border: 0px; margin: 0px; padding: 0px;"><div style="border: 0px; padding: 0px;">
The GDSDocument property structure is built recursively to match the JSON object structure.</div>
<ul style="border: 0px; margin: 0px; padding: 0px 0px 0px 30px;">
<li style="border: 0px; margin: 0px; padding: 0px;">Simple values become simple property values</li>
<li style="border: 0px; margin: 0px; padding: 0px;">Arrays of simple values become a repeated GenericProperty. ie: you can search on the contents.</li>
<li style="border: 0px; margin: 0px; padding: 0px;">Arrays which include dicts or arrays become JSON in a GDSJson object, which just holds "json", a JsonProperty (nothing inside is indexed or searchable).</li>
<li style="border: 0px; margin: 0px; padding: 0px;">Dictionaries become another GDSDocument</li>
<li style="border: 0px; margin: 0px; padding: 0px;">So nested dictionary fields are fully indexed and searchable, including where their values are lists of simple types, but anything inside a complex array is not.</li>
</ul>
</li>
</ul>
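<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
The mapping rules above can be sketched in plain Python 3. This is only an illustration and not gaedocstore's own code: classify_value, describe and the labels they return are hypothetical names, and nothing here touches ndb.</div>

```python
def is_simple(value):
    """True for JSON scalars: strings, numbers, booleans and null."""
    return value is None or isinstance(value, (str, int, float, bool))


def classify_value(value):
    """Name the kind of storage the rules above would pick for a JSON value."""
    if isinstance(value, dict):
        return "nested GDSDocument"            # indexed, searchable
    if isinstance(value, list):
        if all(is_simple(item) for item in value):
            return "repeated GenericProperty"  # indexed, searchable
        return "GDSJson blob"                  # not indexed
    if is_simple(value):
        return "simple property"               # indexed, searchable
    raise TypeError("not a JSON value: %r" % (value,))


def describe(document):
    """Map each top-level field of a dict to its storage kind."""
    if not isinstance(document, dict):
        raise ValueError("Input must be a dictionary")
    if "key" not in document:
        raise ValueError("Input must include a key")
    return {name: classify_value(value) for name, value in document.items()}
```

<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Calling describe on a dict with no "key" raises, which mirrors the rule that such a put is rejected.</div>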
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
eg:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">ldictPerson = {
    "key": "897654",
    "type": "Person",
    "name": "Fred",
    "address":
    {
        "addr1": "1 thing st",
        "city": "stuffville",
        "zipcode": 54321,
        "tags": ['some', 'tags']
    }
}
lperson = GDSDocument.ConstructFromDict(ldictPerson)
lperson.put()
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
This will create a new person. If a GDSDocument with key "897654" already existed then this will overwrite it. If you'd like to instead merge over the top of an existing GDSDocument, you can use aReplace = False, eg:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">lperson = GDSDocument.ConstructFromDict(ldictPerson, aReplace = False)
</code></pre>
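<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
The difference between the two modes can be pictured with plain dicts. The sketch below rests on an assumption about what "merge" means here (a recursive merge in which values from the new JSON win); merge_update is a hypothetical helper, not part of gaedocstore.</div>

```python
def merge_update(existing, new):
    """Return a copy of `existing` with `new` merged over the top.

    Nested dicts are merged recursively; any other value in `new`
    replaces the stored one. (Assumed semantics, for illustration.)
    """
    merged = dict(existing)
    for name, value in new.items():
        if isinstance(value, dict) and isinstance(merged.get(name), dict):
            merged[name] = merge_update(merged[name], value)
        else:
            merged[name] = value
    return merged


stored = {"key": "897654", "name": "Fred", "address": {"city": "stuffville"}}
incoming = {"key": "897654", "address": {"zipcode": 54321}}

replaced = dict(incoming)                 # "Replace": only the new JSON survives
updated = merge_update(stored, incoming)  # "Update": new JSON merged over old
```

<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Under these assumptions, replaced loses "name" and "city", while updated keeps both and gains "zipcode".</div>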
<h2 style="-webkit-font-smoothing: antialiased; border-bottom-color: rgb(204, 204, 204); border-bottom-style: solid; border-width: 0px 0px 1px; cursor: text; font-size: 24px; margin: 20px 0px 10px; padding: 0px; position: relative;">
<a class="anchor" href="https://github.com/emlynoregan/gaedocstore/edit/master/README.md#simple-get" name="simple-get" style="border: 0px; bottom: 0px; color: #4183c4; cursor: pointer; display: block; left: 0px; margin: 0px 0px 0px -30px; padding: 0px 0px 0px 30px; position: absolute; text-decoration: initial; top: 0px;"></a>Simple Get</h2>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; padding: 0px;">
All GDSDocument objects have a top level key. Normal ndb gets (Key.get() or get_by_id()) are used to get objects by their key.</div>
<h2 style="-webkit-font-smoothing: antialiased; border-bottom-color: rgb(204, 204, 204); border-bottom-style: solid; border-width: 0px 0px 1px; cursor: text; font-size: 24px; margin: 20px 0px 10px; padding: 0px; position: relative;">
<a class="anchor" href="https://github.com/emlynoregan/gaedocstore/edit/master/README.md#querying" name="querying" style="border: 0px; bottom: 0px; color: #4183c4; cursor: pointer; display: block; left: 0px; margin: 0px 0px 0px -30px; padding: 0px 0px 0px 30px; position: absolute; text-decoration: initial; top: 0px;"></a>Querying</h2>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; padding: 0px;">
Normal ndb querying can be used on the GDSDocument entities. It is recommended that different types of data (eg Person, Address) are denoted using a top level attribute "type". This is only a recommended convention however, and is in no way required.</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
You can query on properties in the GDSDocument, ie: properties from the original JSON.</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Querying based on properties in nested dictionaries is fully supported.</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
eg: Say I store the following JSON:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">{
"key": "897654",
"type": "Person",
"name": "Fred",
"address":
{
"key": "1234567",
"type": "Address",
"addr1": "1 thing st",
"city": "stuffville",
"zipcode": 54321
}
}
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
A query that would return potentially multiple objects including this one is:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">GDSDocument.gql("WHERE address.zipcode = 54321").fetch()
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
or</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">s = GenericProperty()
s._name = 'address.zipcode'
GDSDocument.query(s == 54321).fetch()
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Note that if you are querying on properties below the top level, you cannot do the more standard</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">GDSDocument.query(GenericProperty('address.zipcode') == 54321).fetch() # fails
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
due to a <a href="http://stackoverflow.com/questions/13631884/ndb-querying-a-genericproperty-in-repeated-expando-structuredproperty" style="border: 0px; color: #4183c4; margin: 0px; padding: 0px; text-decoration: initial;">limitation of ndb</a>.</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
If you need to get the json back from a GDSDocument, just do this:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">ldict = lgdsDocument.to_dict()  # named ldict to avoid shadowing the json module
</code></pre>
<h2 style="-webkit-font-smoothing: antialiased; border-bottom-color: rgb(204, 204, 204); border-bottom-style: solid; border-width: 0px 0px 1px; cursor: text; font-size: 24px; margin: 20px 0px 10px; padding: 0px; position: relative;">
<a class="anchor" href="https://github.com/emlynoregan/gaedocstore/edit/master/README.md#denormalized-object-linking" name="denormalized-object-linking" style="border: 0px; bottom: 0px; color: #4183c4; cursor: pointer; display: block; left: 0px; margin: 0px 0px 0px -30px; padding: 0px 0px 0px 30px; position: absolute; text-decoration: initial; top: 0px;"></a>Denormalized Object Linking</h2>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; padding: 0px;">
You can directly support denormalized object linking.</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Say you have two entities, an Address:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">{
"key": "1234567",
"type": "Address",
"addr1": "1 thing st",
"city": "stuffville",
"zipcode": 54321
}
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
and a Person:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">{
"key": "897654",
"type": "Person",
"name": "Fred",
"address": // put the address with key "1234567" here
}
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
You'd like to store the Person so the correct linked address is there; not just the key, but the values (type, addr1, city, zipcode).</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
If you store the Person as:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">{
"key": "897654",
"type": "Person",
"name": "Fred",
"address": {"key": "1234567"}
}
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
then this will automatically be expanded to</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">{
"key": "897654",
"type": "Person",
"name": "Fred",
"address":
{
"key": "1234567",
"type": "Address",
"addr1": "1 thing st",
"city": "stuffville",
"zipcode": 54321
}
}
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Furthermore, gaedocstore will update these values if you change address. So if address changes to:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">{
"key": "1234567",
"type": "Address",
"addr1": "2 thing st",
"city": "somewheretown",
"zipcode": 12345
}
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
then the person will automatically update to</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">{
"key": "897654",
"type": "Person",
"name": "Fred",
"address":
{
"key": "1234567",
"type": "Address",
"addr1": "2 thing st",
"city": "somewheretown",
"zipcode": 12345
}
}
</code></pre>
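<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
The expand-and-refresh behaviour described above can be imitated with a toy in-memory store. This is a sketch under stated assumptions, not how gaedocstore is implemented: LinkingStore is a hypothetical class, it only looks at top level properties, and it finds referring documents by scanning everything, where a real datastore would presumably use a query.</div>

```python
class LinkingStore(object):
    """Toy in-memory store that expands {"key": ...} references on put."""

    def __init__(self):
        self.documents = {}

    def _expand(self, document):
        expanded = {}
        for name, value in document.items():
            if isinstance(value, dict) and "key" in value:
                source = self.documents.get(value["key"])
                # copy the referenced document inline when it exists
                expanded[name] = dict(source) if source else dict(value)
            else:
                expanded[name] = value
        return expanded

    def put(self, document):
        key = document["key"]
        stored = self._expand(document)
        self.documents[key] = stored
        # refresh every denormalised copy of the document just written
        for other in self.documents.values():
            if other is stored:
                continue
            for name, value in list(other.items()):
                if isinstance(value, dict) and value.get("key") == key:
                    other[name] = dict(stored)
        return stored
```

<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Putting the Address, then the Person with "address": {"key": "1234567"}, then the changed Address leaves the stored Person holding the new address values.</div>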
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Denormalized Object Linking also supports <a href="https://github.com/emlynoregan/pybOTL" style="border: 0px; color: #4183c4; margin: 0px; padding: 0px; text-decoration: initial;">pybOTL transform templates</a>. gaedocstore can take a list of "name", "transform" pairs. When a key appears like</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">{
...
"something": { "key": XXX },
...
}
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
then gaedocstore loads the key referenced. If found, it looks in its list of transform names. If it finds one, it applies that transform to the loaded object, and puts the output into the stored GDSDocument. If no transform was found, then the entire object is put into the stored GDSDocument as described above.</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
eg:</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Say we have the transform "address" as follows:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">ltransform = {
"fulladdr": "{{.addr1}}, {{.city}} {{.zipcode}}"
}
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
You can store this transform against the name "address" for gaedocstore to find as follows:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">GDSDocument.StorebOTLTransform("address", ltransform)
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Then when Person above is stored, it'll have its address placed inline as follows:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">{
"key": "897654",
"type": "Person",
"name": "Fred",
"address":
{
"key": "1234567",
"fulladdr": "2 thing st, somewheretown 12345"
}
}
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
An analogous process happens to embedded addresses whenever the Address object is updated.</div>
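<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
For readers without pybOTL to hand, the effect of the "address" transform can be imitated with a small stand-in. The regex substitution below is not pybOTL and handles only the {{.name}} form used in this example; apply_template and link_with_transform are hypothetical names, and keeping "key" alongside the transform output is taken from the result shown above.</div>

```python
import re

# matches placeholders of the form {{.fieldname}}
PLACEHOLDER = re.compile(r"\{\{\.(\w+)\}\}")


def apply_template(template, source):
    """Fill each {{.name}} placeholder from the top level of `source`."""
    output = {}
    for name, pattern in template.items():
        output[name] = PLACEHOLDER.sub(
            lambda match: str(source[match.group(1)]), pattern)
    return output


def link_with_transform(source, template=None):
    """Build the inline copy: the whole object, or key plus transform output."""
    if template is None:
        return dict(source)
    linked = {"key": source["key"]}
    linked.update(apply_template(template, source))
    return linked
```

<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
With no template the whole source object is copied, matching the untransformed behaviour described earlier.</div>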
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
You can look up the bOTL Transform with:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">ltransform = GDSDocument.GetbOTLTransform("address")
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
and delete it with:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">GDSDocument.DeletebOTLTransform("address")
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Desired feature (not yet implemented): If the template itself is updated, then all objects affected by that template are also updated.</div>
<h3 style="-webkit-font-smoothing: antialiased; border: 0px; color: #333333; cursor: text; font-size: 18px; margin: 20px 0px 10px; padding: 0px; position: relative;">
<a class="anchor" href="https://github.com/emlynoregan/gaedocstore/edit/master/README.md#deletion" name="deletion" style="border: 0px; bottom: 0px; color: #4183c4; cursor: pointer; display: block; left: 0px; margin: 0px 0px 0px -30px; padding: 0px 0px 0px 30px; position: absolute; text-decoration: initial; top: 0px;"></a>Deletion</h3>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; padding: 0px;">
If an object is deleted, then all denormalized links will be updated with a special key "link_missing": True. For example, say we delete address "1234567". Then Person will become:</div>
<pre style="background-color: #f8f8f8; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 13px; font-weight: normal; line-height: 19px; margin-bottom: 15px; margin-top: 15px; overflow: auto; padding: 6px 10px;"><code style="background-color: transparent; border-bottom-left-radius: 3px; border-bottom-right-radius: 3px; border-top-left-radius: 3px; border-top-right-radius: 3px; border: none; font-family: Consolas, 'Liberation Mono', Courier, monospace; font-size: 12px; margin: 0px; padding: 0px;">{
    "key": "897654",
    "type": "Person",
    "name": "Fred",
    "address":
    {
        "key": "1234567",
        "link_missing": True
    }
}
</code></pre>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
And if the object is recreated in the future, then that linked data will be reinstated as expected.</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; margin-top: 15px; padding: 0px;">
Similarly, if an object is saved with a link, but the linked object can't be found, "link_missing": True will be included as above.</div>
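<div>
As a rough illustration of the behaviour described above (this is not gaedocstore's actual code), the effect on a denormalized link can be sketched with plain Python dicts. The helper names <code>mark_link_missing</code> and <code>restore_link</code> are hypothetical, and the sketch assumes the embedded link is a dict carrying a "key".</div>

```python
import copy

def mark_link_missing(doc, link_name):
    """Return a copy of doc whose embedded link keeps only its key,
    plus the special "link_missing" flag."""
    result = copy.deepcopy(doc)
    link = result.get(link_name)
    if isinstance(link, dict) and "key" in link:
        result[link_name] = {"key": link["key"], "link_missing": True}
    return result

def restore_link(doc, link_name, linked_obj):
    """Return a copy of doc with the embedded link rebuilt from linked_obj.
    The object is copied whole here; a transform could pick fields instead."""
    result = copy.deepcopy(doc)
    result[link_name] = copy.deepcopy(linked_obj)
    return result

person = {
    "key": "897654",
    "type": "Person",
    "name": "Fred",
    "address": {"key": "1234567", "fulladdr": "2 thing st, somewheretown 12345"},
}

# Deleting address "1234567" leaves only the key and the flag behind.
orphaned = mark_link_missing(person, "address")

# Recreating the address reinstates the linked data.
recreated = restore_link(
    orphaned, "address",
    {"key": "1234567", "fulladdr": "2 thing st, somewheretown 12345"},
)
```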
<h3 style="-webkit-font-smoothing: antialiased; border: 0px; color: #333333; cursor: text; font-size: 18px; margin: 20px 0px 10px; padding: 0px; position: relative;">
<a class="anchor" href="https://github.com/emlynoregan/gaedocstore/edit/master/README.md#updating-denormalized-linked-data-back-to-parents" name="updating-denormalized-linked-data-back-to-parents" style="border: 0px; bottom: 0px; color: #4183c4; cursor: pointer; display: block; left: 0px; margin: 0px 0px 0px -30px; padding: 0px 0px 0px 30px; position: absolute; text-decoration: initial; top: 0px;"></a>Updating denormalized linked data back to parents</h3>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; padding: 0px;">
The current version does not support this, but a future version may support changing the denormalized information and having it flow back to the original object; e.g. you could change addr1 in address inside person, and it would fix the source address. Note this won't work when transforms are being used (you would need inverse transforms).</div>
<h3 style="-webkit-font-smoothing: antialiased; border: 0px; color: #333333; cursor: text; font-size: 18px; margin: 20px 0px 10px; padding: 0px; position: relative;">
<a class="anchor" href="https://github.com/emlynoregan/gaedocstore/edit/master/README.md#storing-deltas" name="storing-deltas" style="border: 0px; bottom: 0px; color: #4183c4; cursor: pointer; display: block; left: 0px; margin: 0px 0px 0px -30px; padding: 0px 0px 0px 30px; position: absolute; text-decoration: initial; top: 0px;"></a>Storing deltas</h3>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-bottom: 15px; padding: 0px;">
I've had a feature request from a friend, to have a mode that stores a version history of all changes to objects. I think it's a great idea. I'd like a strongly parsimonious feel for the library as a whole (it should just feel like "ndb with benefits").</div>
<div style="border: 0px; color: #333333; font-size: 14px; font-weight: normal; line-height: 22px; margin-top: 15px; padding: 0px;">
<br /></div>
<h1>It's so, like, social!</h1>
<div>
Posted 2011-12-29, updated 2011-12-31.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtLTpVUhSRF1A3ZyHP2sFCRNRcbletVzAwlF5xBnrQxJCXHdrlJ7-BfD13UtcaCGDXmTDdFuVm7ziyzz0c5Z_lsxydHH0sKBYS5lcEs_GJQ1LDhddUWLd35chgRn6JpMbFFnzUn4S2upc/s1600/social-code-1024x385.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="120" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtLTpVUhSRF1A3ZyHP2sFCRNRcbletVzAwlF5xBnrQxJCXHdrlJ7-BfD13UtcaCGDXmTDdFuVm7ziyzz0c5Z_lsxydHH0sKBYS5lcEs_GJQ1LDhddUWLd35chgRn6JpMbFFnzUn4S2upc/s320/social-code-1024x385.jpg" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="tr_bq">
Ok, so I know I'm supposed to do the second half of the post about the REST interface, but I got distracted.</div>
<br />
You see, I'm on holiday.<br />
<br />
On holiday, I like to do something just crazy and interesting. And not too much work ;-)<br />
<br />
Anyway, what I've built is a very alpha-ish prototype of a concept for Social Coding.<br />
<br />
The original idea for this came to me in September. <a href="https://plus.google.com/100281903174934656260/posts/W2oXq7qwMcC" target="_blank">I posted about it on Google+</a>. Here's the content of the post:<br />
<blockquote>
<span style="color: #444444;">Imagine a web app for social coding.<br />
<br />
You get there via a link from somewhere else. At the top of the page in big typeface is the name of a function, the signature (parameters, return type) and a short description.<br />
<br />
Then, side by side, are two columns.<br />
<br />
On the left, there is a list of unit test implementations. Each is just a slab of code with asserts in it.<br />
<br />
On the right there is a list of implementations.<br />
<br />
Each unit test and each implementation can be commented on, and can be modded up or down.<br />
<br />
Unit tests are listed in two sections; "Working" at the top and "Broken" at the bottom. The "Broken" section lists tests that are syntactically busted.<br />
<br />
Within the sections, the unit tests are listed in order of moderation score.<br />
<br />
Implementations are listed in order of the number of unit tests passed. Within the same number, they are listed in moderation order.<br />
<br />
You are free to<br />
- Add a new unit test (possibly based on an existing test)<br />
- Add a new implementation (possibly based on an existing implementation)<br />
- Moderate other tests and implementations<br />
- Comment on anything<br />
<br />
You can share the page on social networks and so forth with the usual buttons.<br />
<br />
The owner of the page is the person who originally defined the function, including description. The owner can modify the definition, can mark any unit tests and implementations as not acceptable.<br />
<br />
The function is considered implemented if there is an acceptable implementation that passes all acceptable, working unit tests. An implemented function may still be re-implemented at any time in the future, or become unimplemented by the addition of new tests which it fails.<br />
<br />
When writing a test or an implementation, you may define new methods that don't yet exist. All user defined methods behave as links to their own page. Methods that don't have a page have one created for them when first clicked, wiki-like.<br />
<br />
You can also define a new function directly, without actually using it in another function.<br />
<br />
A method which relies on other methods which are not considered implemented is also not considered implemented.<br />
<br />
A unit test which relies on methods which are not considered implemented is part of the "broken" group.<br />
<br />
Analogs of this function definition mechanism should be created for object oriented Classes and for html Pages.<br />
<br />
If, say, both Javascript and Python are allowed, then you could build a google AppEngine app this way.<br />
<br />
Thoughts?</span></blockquote>
A long chat ensued. A bit of enthusiasm flared up, and then went nowhere, because I did nothing with the idea. Yeah!<br />
<br />
Anyway, now I have. I've built a prototype of this idea, in fact, called "Social Code". What a creative name!<br />
<br />
It's hosted here: <a href="http://socialcode.emlynhrdtest.appspot.com/">http://socialcode.emlynhrdtest.appspot.com/</a><br />
<br />
The code is in github here: <a href="https://github.com/emlynoregan/Social-Code">https://github.com/emlynoregan/Social-Code</a><br />
<br />
Basically, it works like this:<br />
<br />
<ul>
<li>Anyone can log in with a google account.</li>
<li>You can create functions. A function consists of:</li>
<ul>
<li>A name (purely for descriptive purposes)</li>
<li>A slab of python code, hopefully containing a function implementation, but can be anything really.</li>
<li>Another slab of python code, Tests, which should be tests of the function implementation, but again can be anything. When the user hits "Save and Run!", this code is run; the run succeeds if no errors or exceptions are thrown, and fails otherwise.</li>
<li>A list of past runs, with results and logs.</li>
</ul>
<li>Functions can refer to each other</li>
<ul>
<li>Say I've separately defined the function "Add"</li>
<li>If you want to call "Add" in your code, you need to include {{Add}} somewhere in your implementation code, probably in a comment. When you save it, it'll be parsed out, and appear in the "Imports" list.</li>
<li>You can call anything defined in "Add" in your code. This might include a python function called "Add", but only by convention.</li>
</ul>
</ul>
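<div>
To make the {{Add}} convention above concrete, here is a minimal sketch of how such references could be pulled out of a slab of code. It is not taken from the Social Code repo; the name <code>find_imports</code> and the exact pattern are assumptions for illustration.</div>

```python
import re

# Matches {{Name}} anywhere in the source, including inside comments.
IMPORT_PATTERN = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")

def find_imports(source):
    """Return referenced function names in first-seen order, without duplicates."""
    names = []
    for name in IMPORT_PATTERN.findall(source):
        if name not in names:
            names.append(name)
    return names

multiply_source = '''
# uses {{Add}}
def Multiply(a, b):
    result = 0
    for _ in range(b):
        result = Add(result, a)
    return result
'''

print(find_imports(multiply_source))  # ['Add']
```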
<div>
For a really simple example of the way this hangs together, select the function <a href="http://socialcode.emlynhrdtest.appspot.com/function?id=34002" target="_blank">Multiply</a>, see what's going on, then click through to dependencies etc.</div>
<div>
<br />
Now, anyone can change any of the tests or any of the implementations of any functions, delete any functions, add new ones. That's all so open that the whole thing's only usable as a curiosity at the moment.</div>
<div>
<br /></div>
<div>
In fact my current implementation is really daggy. Not a stick of javascript (or there is, there's an onclick confirm function for the delete button). So mondo page refreshes. Also no design, some css where necessary, inline (gasp!).<br />
<br />
And OMG you'll hate editing code in textareas.</div>
<div>
<br /></div>
<div>
Also the "Save and Run!" button just cobbles all the code (tests + implementation + dependencies' implementations) into one big slab o' text and calls "exec" on it. Not terribly sophisticated. It'll eventually need to do stuff like run in an appengine task, probably work with code in a more sophisticated manner (should I be doing something with code objects?) etc. </div>
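<div>
The cobble-and-exec approach can be sketched roughly as follows. This is an illustration under assumptions, not the code from the repo: the function name <code>save_and_run</code>, its parameters and the (succeeded, log) return shape are all made up for the example.</div>

```python
import traceback

def save_and_run(implementation, tests, dependency_sources):
    """Join dependencies, implementation and tests into one source text,
    exec it in a fresh namespace, and report (succeeded, log)."""
    source = "\n".join(list(dependency_sources) + [implementation, tests])
    namespace = {}
    try:
        exec(source, namespace)
    except Exception:
        # Any error or failed assert counts as a failed run.
        return False, traceback.format_exc()
    return True, ""

add_source = "def Add(a, b):\n    return a + b\n"
multiply_source = (
    "def Multiply(a, b):\n"
    "    result = 0\n"
    "    for _ in range(b):\n"
    "        result = Add(result, a)\n"
    "    return result\n"
)

good_run = save_and_run(multiply_source, "assert Multiply(3, 4) == 12\n", [add_source])
bad_run = save_and_run(multiply_source, "assert Multiply(3, 4) == 13\n", [add_source])
```

<div>
Here <code>good_run</code> reports success with an empty log, while <code>bad_run</code> reports failure and carries the traceback of the failed assert.</div>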
<div>
<br /></div>
<div>
I've put the code in a serious sandbox (I think), by not allowing the __builtin__ collection to be in scope. You can't do a python import. So, you can only currently do simple things, intentionally. Opening this up in some way will be necessary, but it'll need some serious discussion, and care I think.</div>
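<div>
A minimal sketch of the idea, written for Python 3 (where the globals key is <code>__builtins__</code>); the helper name <code>run_restricted</code> is hypothetical. Note that emptying the builtins only blocks the obvious routes. It is well known that determined code can still escape through object introspection, so treat this as a demonstration of the mechanism rather than a real security boundary.</div>

```python
def run_restricted(source):
    """Exec source with an empty builtins table, so names such as open()
    and the import statement are unavailable to it."""
    namespace = {"__builtins__": {}}
    exec(source, namespace)
    return namespace

# Plain arithmetic and function definitions still work.
ns = run_restricted("def Add(a, b):\n    return a + b\nresult = Add(2, 3)")
print(ns["result"])  # 5

# Imports and file access do not.
blocked_errors = []
for blocked in ("import os", "open('somefile')"):
    try:
        run_restricted(blocked)
    except (ImportError, NameError) as e:
        blocked_errors.append(type(e).__name__)
print(blocked_errors)
```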
<div>
<br /></div>
<div>
But it gives you something to play with, get a feel for the kind of thing I'm thinking about.</div>
<div>
<br /></div>
<div>
The intention is that, instead of one slab of code for "Implementation" and one for "Tests", we'll have many competing Implementations and many competing Tests. There'll be a concept of the current implementation for any function, which is one that passes all tests, which is voted up by the most users, and probably which is newest, in case of ties. That's the one that other dependent functions will import.</div>
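<div>
That selection rule could look something like the sketch below. Nothing like this exists in the prototype yet; the <code>Candidate</code> record and its field names are assumptions made up for the example.</div>

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    passes_all_tests: bool
    votes: int
    created: int  # larger means newer, e.g. a timestamp

def current_implementation(candidates):
    """Pick the passing candidate with the most votes; newest wins ties.
    Returns None when nothing passes all tests."""
    passing = [c for c in candidates if c.passes_all_tests]
    if not passing:
        return None
    return max(passing, key=lambda c: (c.votes, c.created))

candidates = [
    Candidate("popular but broken", False, 50, 1),
    Candidate("old and good", True, 7, 2),
    Candidate("new and good", True, 7, 3),
    Candidate("unloved", True, 1, 4),
]
print(current_implementation(candidates).name)  # new and good
```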
<div>
<br /></div>
<div>
I think potential implementations need to be immutable. If you want to edit an implementation, you actually copy it and edit the copy. Might need a draft/published mechanism to help with this. So you see all the published implementations, and all your own drafts. Published implementations are immutable, drafts are not.</div>
<div>
<br /></div>
<div>
I don't know if/how this applies to tests. </div>
<div>
<br /></div>
<div>
Also, I think functions should be able to define acceptance tests on other functions. If B depends on A (imports it) then B should be able to define acceptance tests on A. If A changes such that it has a new implementation and tests, and passes those tests, but is no longer suitable for B, then B's acceptance tests will fail and flag that B needs fixing.</div>
<div>
<br /></div>
<div>
Of course the current giant-slab-o-tests approach could accommodate this; in the tests for B, just define some tests of A.</div>
<div>
<br /></div>
<div>
Anyway, have a play, <a href="https://plus.google.com/100281903174934656260/posts/W2oXq7qwMcC" target="_blank">read the old google+ post and comments</a>, and come <a href="https://plus.google.com/100281903174934656260/posts/ijawMjyLxmT" target="_blank">chat on the new google+ post</a> where hopefully people will be yammering about this.</div>
<h1>Todo: Rest</h1>
<div>
Posted 2011-12-11.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6qydbeC3OypRGVojXY937pnr60J-Q89giV8Fxpk9NKzBmhbTO2s3j-GK1-zHMX3o81mJtG1dydcmDlr6g2ESlArxmV0ReLJ0mqCRxKceP0pZDefgWn5Wzymh682V3NPOkBnkxbItScE4/s1600/todorest.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="204" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6qydbeC3OypRGVojXY937pnr60J-Q89giV8Fxpk9NKzBmhbTO2s3j-GK1-zHMX3o81mJtG1dydcmDlr6g2ESlArxmV0ReLJ0mqCRxKceP0pZDefgWn5Wzymh682V3NPOkBnkxbItScE4/s320/todorest.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
First of all, by way of apology for the chirping crickets on this blog, let me say I've been otherwise occupied.<br />
<br />
Not busy. I don't have 7 different things to get done by lunch time. I keep my schedule as simple as possible; usually it's get up, get it together, go to work, do cool stuff, come home, do family stuff, crash. Basically my policy is, if I have two things scheduled in a day, that's one too many. My brain just doesn't like that.<br />
<br />
But occupied, yep. And that's because I changed jobs, threw out pretty much my entire existing skillset, and started from scratch.<br />
<br />
Well, not the whole lot. It's still code. Of course. The world is basically composed of stuff that is code, and stuff that is muttering, and I try to steer clear of muttering. But still, if you're a professional senior C# dev, and have been doing windows coding for over a decade, then it turns out in retrospect to be a thing of some gravity to chuck all that in and jump into the world of LAMP, eschew that and recommend AppEngine, then go hell for leather on rich modern javascript apps on this platform, with all open source tools.<br />
<br />
A great and terrible thing.<br />
<br />
(and of course some obligatory PHP support is involved, which is pure masochism).<br />
<br />
Luckily <a href="http://ecampus.com.au/">my new job is awesome</a>, for a little company with brilliant people and the desire to build a really first class dev team, building outstanding stuff. And a healthy hatred for Microsoft (well, you know, we'll integrate with their stuff and all that, but we wouldn't use it ourselves, bleh). The very fact that I can blog about the details of what I've been building commercially, and even release some of it, should say something positive about it.<br />
<br />
Also luckily, I'm working with <a href="http://splunderousnoog.com/">a first rate front end developer</a>, who's all about UX, who is all over javascript and jquery and underscore and backbone and etc etc etc. And that's because I'm mediocre at best at UI dev. I can never take it seriously; making computers easier for the illiterate feels somehow like enabling an alcoholic, and that's just not really in my nature. Figure it out you useless bastards, really.<br />
<br />
Anyway, long story stupidly too long.<br />
<br />
The crux of it is, I've been trying to learn to do AppEngine well, to do Python well, to understand the crazy (and rather spectacular) world of kludge that is modern javascript programming; more like Escher than Kafka I think, but I'm not sure. Javascript is oddly beautiful actually, and it's lovely to see functional programming burst onto the mainstream(ish) scene finally, as confused as it currently is with OO.<br />
<br />
So my mind has been almost entirely occupied with this. I've been frenetic, I've been focused. Not in the "now I'm really paying attention" way, but more in the <a href="http://en.wikipedia.org/wiki/A_Deepness_in_the_Sky">Vernor Vinge's "A Deepness In The Sky"</a> way. Drooling and a bit broken, looked after by my darling wife.<br />
<br />
So just as I'm making some breakthroughs, starting to Get It and feel a bit more comfortable, I got sick for a couple of days. Flu/Cold/Virusy thing. Bedridden, looked after by my darling wife.<br />
<br />
All that was on my todo list was to rest.<br />
<br />
And, I'm not making this up, what came to my mind was "Todo, rest, hey I could get that todo list sample app and make it talk to AppEngine via a RESTful interface".<br />
<br />
So I built a little REST library, called Sleepy.py, while lying in my sick bed.<br />
<br />
As I said, drooling and broken.<br />
<br />
---<br />
<br />
I've not come at Sleepy cold. Sleepy is actually a rewrite of a rewrite.<br />
<br />
What I've been mostly doing commercially in the last few weeks has been to build a decent REST interface for AppEngine. Which of course I shouldn't be doing; just grab one that's already there. The problem is, I haven't been able to find anything serviceable. I looked, believe me, and even tried some out, but they just felt clunky; heavyweight, XML oriented (we want to use JSON), not really welcoming customization, a square peg for a round hole.<br />
<br />
I thought we should stick with the existing options, because look how big and complex they are, we don't want to do all that work ourselves! But my good colleague protested, went away one weekend and came back with his own simple REST interface library, which was cleaner, much shorter, and really a bit of a revelation. It needed work, but that opened my eyes to his central observation, which was that this doesn't need to be that hard.<br />
<br />
And it doesn't, of course. I mean, what is REST really?<br />
<br />
<b>What is REST?</b><br />
<br />
Let me google that for you:<br />
<br />
<a href="http://en.wikipedia.org/wiki/Representational_state_transfer">http://en.wikipedia.org/wiki/Representational_state_transfer</a><br />
<br />
<b>That answer sucks</b><br />
<br />
Ok, then what REST means to me, in a practical sense, is an HTTP based API, using the big four requests (GET, PUT, POST, DELETE), which suits heavy javascript clients (ie: talks JSON), and which lets the client side talk to my server side datamodel in some straightforward way.<br />
<br />
Now I know that's hopelessly non-purist, but what I'm trying to achieve is this:<br />
<br />
- We've got a rough idea for an app to build.<br />
- I build a server side, which is largely data model.<br />
- Someone who is good at it builds a client side javascript app, replete with its own local data model, which must be synced somehow with the server side. That somehow is something very like backbone.js's expectation of a RESTful web api to talk to.<br />
- We'd also like the possibility of other stuff talking to the same api (iPad apps, third party code, martian mind control beams, whatever).<br />
- I must be able to set up a RESTful web api to let the client talk to the server. It needs to give a good balance of simplicity and customizable behaviour, be quick to put in place, and just be clean and easy to work with.<br />
<br />
<b>Anyway enough jibber jabber, where's some code?</b><br />
<br />
Yeah yeah ok.<br />
<br />
What I decided to do here to demonstrate my approach is to take a simple sample app that demonstrates the use of backbone.js, and make it work in the context of AppEngine. The sample app, by <a href="http://jgn.me/" style="background-color: #f4f4f4; color: black; font-family: 'Helvetica Neue', Helvetica, Arial; font-size: 14px; line-height: 22px;">Jérôme Gravel-Niquet</a>, is a simple but nice to use todo list, which it stores in local storage in the browser (so it really has no back end).<br />
<br />
You can check out the sample app here: <a href="http://documentcloud.github.com/backbone/#examples">http://documentcloud.github.com/backbone/#examples</a><br />
<br />
Have a play with it here: <a href="http://documentcloud.github.com/backbone/examples/todos/index.html">http://documentcloud.github.com/backbone/examples/todos/index.html</a><br />
<br />
<b>Starting off</b><br />
<br />
So I started by downloading all the static files for the todo list app, sticking them in a basic appengine app, and popping that in git on github.<br />
<br />
The git repo is here: <a href="https://github.com/emlynoregan/appenginetodos">https://github.com/emlynoregan/appenginetodos</a><br />
<br />
The first version that just hosts the static app, still backed by localstorage, is available at the tag v1:<br />
<a href="https://github.com/emlynoregan/appenginetodos/tags">https://github.com/emlynoregan/appenginetodos/tags</a><br />
<br />
You can run that in your local appengine devserver, or upload it to appengine proper, and it'll work. But it's not touching the datastore.<br />
<br />
<b>Datamodel</b><br />
<br />
The app is a simple todo list. There's no concept of users, of different lists, nothing. There's just one list, which is a list of Todo items. A todo item is composed of three elements:<br />
<br />
<ul>
<li>text - string, the text that the user enters describing what is to be done </li>
<li>order - int, the numerical order of this item in the list; lower numbers come first.</li>
<li>done - bool, whether the item is done or not</li>
</ul>
<div>
The first thing we need is a datastore model to describe this. This is a simple one:</div>
<div>
<pre>from google.appengine.ext import db

<span class="Apple-style-span" style="background-color: #cfe2f3;">class ToDo(db.Model):
    text = db.StringProperty()
    order = db.IntegerProperty()
    done = db.BooleanProperty()
    # Server-managed timestamps; not exposed to the client.
    created = db.DateTimeProperty(auto_now_add = True)
    modified = db.DateTimeProperty(auto_now = True)

    def __init__(self, *args, **kwargs):
        db.Model.__init__(self, *args, **kwargs)
        # Default "done" to False when no value was supplied.
        if self.done is None:
            self.done = False
</span></pre>
</div>
<div>
<br />
Just one class, ToDo. It's got text, order and done (and I've put a default value on "done" of false in the constructor). I've also added "created" and "modified" fields just for a bit of fun. These are managed by the server (using auto_now_add and auto_now), and won't be exposed to the client.<br />
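<br />
To picture what "not exposed to the client" means, here is a small sketch of turning an item into the JSON the client would see. It is only an illustration: <code>FakeToDo</code> is a stand-in so the snippet runs without the App Engine SDK, and <code>to_client_dict</code> and <code>CLIENT_FIELDS</code> are hypothetical names, not part of Sleepy.<br />

```python
import json

# Only these fields travel to the client; created/modified stay server side.
CLIENT_FIELDS = ("text", "order", "done")

class FakeToDo(object):
    """Stand-in for the ToDo model, so this runs without the App Engine SDK."""
    def __init__(self, text, order, done=False):
        self.text = text
        self.order = order
        self.done = done
        self.created = "server-managed"
        self.modified = "server-managed"

def to_client_dict(item):
    """Copy just the client-facing fields into a plain dict."""
    return {name: getattr(item, name) for name in CLIENT_FIELDS}

payload = json.dumps(to_client_dict(FakeToDo("rest", 1)), sort_keys=True)
print(payload)  # {"done": false, "order": 1, "text": "rest"}
```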
<br />
<b>And now the REST!</b><br />
<br />
Ok, now we want to expose this datamodel class through a REST api. For the ToDo class, we'll use a resource name of /todos. Locally this might be http://localhost:8080/todos. My version on appengine has the url <a href="http://todos.emlynhrdtest.appspot.com/todos">http://todos.emlynhrdtest.appspot.com/todos</a> .<br />
<br />
The first thing to think about is the routing (kicking off the right handlers for urls) and basic infrastructure. How will we get /todos to be handled the way we want to handle it, and how would that handling be done in any case?<br />
<br />
I figured that since the RESTful web api will be just like a normal web page, but returning JSON, why not use the standard webapp web handler mechanism already available in appengine, and the same routing mechanism? Now we'll want a little help with the routing, because we want to be able to manage /todos, but also calls like /todos/32 (where 32 is a valid id for a model instance) to address particular todo items.<br />
<br />
So, you define the routes by providing a list of resourcename / handler pairs, which a helper function then turns into url / handler pairs as required by the wsgi application.<br />
<br />
Here's the main.py, showing a standard route for '/', then including the calculated restRoutes.
<br />
<br />
<pre><span class="Apple-style-span" style="background-color: #cfe2f3;">import webapp2
import htmlui
import restapi
# basic route for bringing up the app
lroutes = [ ('/', htmlui.ToDoHandler) ]
# add api routes, see restapi/__init__.py
lroutes.extend(restapi.restRoutes)
# create the application with these routes
app = webapp2.WSGIApplication(lroutes, debug=True)
</span></pre>
<br />
restRoutes comes from the restapi module for this app, whose __init__.py looks like this:<br />
<br />
<pre><span class="Apple-style-span" style="background-color: #cfe2f3;">from todoresthandler import *
from sleepy import *

restRoutes = [
    ('todos', ToDoRestHandler)
]

restRoutes = Sleepy.FixRoutes(restRoutes)
</span></pre>
<br />
ToDoRestHandler is a webapp webhandler for the rest api for ToDo. Sleepy.FixRoutes turns restRoutes from a list of (resourcename, handler) pairs to a list of (url, handler) pairs.<br />
<br />
<pre><span class="Apple-style-span" style="background-color: #cfe2f3;">    @classmethod
    def FixRoutes(cls, aRoutes, aRouteBase = None):
        """
        Modifies routes to allow specification of id

        aRoutes should be pairs of resourcename, resource handler.
        This is modified to become pairs of route source, resource handler.

        aRouteBase is anything you want to prepend all route sources with.
        eg: if you want all your route sources to begin with /api,
        use aRouteBase="/api"
        Don't include a trailing slash in aRouteBase.
        """
        retval = []
        for lroute in aRoutes:
            # Capture the resource name, plus an optional /id suffix.
            lfixedRouteSource = '/(%s)(?:/(.*))?' % lroute[0]
            if aRouteBase:
                lfixedRouteSource = aRouteBase + lfixedRouteSource
            lfixedRoute = (lfixedRouteSource, lroute[1])
            retval.append(lfixedRoute)
        return retval
</span></pre>
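<br />
To see what that route pattern actually captures, here is a small standalone check using plain <code>re</code>. It assumes the router matches the pattern against the whole path (hence the added ^ and $); the helper name <code>fix_route_source</code> is made up for the example and simply mirrors the string building in FixRoutes.<br />

```python
import re

def fix_route_source(resource_name, route_base=None):
    """Build the same pattern FixRoutes builds for one resource name."""
    source = '/(%s)(?:/(.*))?' % resource_name
    if route_base:
        source = route_base + source
    return source

pattern = re.compile('^' + fix_route_source('todos') + '$')

results = {}
for path in ('/todos', '/todos/32', '/other'):
    match = pattern.match(path)
    results[path] = match.groups() if match else None
    print(path, results[path])
```

The two captured groups are what arrive in the handler as the resource name and the optional resource argument: '/todos' gives ('todos', None) and '/todos/32' gives ('todos', '32'), while '/other' does not match at all.<br />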
<br />
<br />
We've got the /todos route wired up to ToDoRestHandler. But what's in that? Here it is:<br />
<br />
<pre><span class="Apple-style-span" style="background-color: #cfe2f3;">from google.appengine.ext import webapp
from sleepy import Sleepy
from datamodel import ToDo

class ToDoRestHandler(webapp.RequestHandler):
    def get(self, aResource, aResourceArg, *args, **kwargs):
        Sleepy.GetHandler(self, aResource, aResourceArg, *args, **kwargs)

    def put(self, aResource, aResourceArg, *args, **kwargs):
        Sleepy.PutHandler(self, aResource, aResourceArg, *args, **kwargs)

    def post(self, aResource, aResourceArg, *args, **kwargs):
        Sleepy.PostHandler(self, aResource, aResourceArg, *args, **kwargs)

    def delete(self, aResource, aResourceArg, *args, **kwargs):
        Sleepy.DeleteHandler(self, aResource, aResourceArg, *args, **kwargs)

    def GetModelClass(self):
        return ToDo</span>
</pre>
<br />
What's going on here? Well, I've delegated all the functionality of GET, PUT, POST and DELETE to the Sleepy library. Plus, there's an extra method GetModelClass which tells us (actually Sleepy) which model class we're working with.<br />
<br />
<b>So basically you've shown us nothing. What's in this Sleepy?</b><br />
<br />
No I haven't, have I? I'll get onto what's in Sleepy in the next post. For now, if you want to skip ahead, just check out the <a href="https://github.com/emlynoregan/appenginetodos">git repo</a>. Otherwise, you can wait for the next installment, lazy person!</div>Anonymoushttp://www.blogger.com/profile/11980745475562786998noreply@blogger.com0tag:blogger.com,1999:blog-4275632007165220704.post-18118261667718357572011-10-23T23:24:00.001+10:302011-10-23T23:57:06.376+10:30The Dining Philosophers<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrqTM1e4LmbokQ2p7UxCaWZZH8Yk0Fv2UTrD2jo3RjIX1mPxfuh3L5IW9KCPDglNZJec5oDpAumvkyriXuhKP5y0kDWBfqm34Pk2OVq4qDAXowEptvclDkELrP7qLy01Nhg-udX5WAzmY/s1600/200px-Dining_philosophers.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrqTM1e4LmbokQ2p7UxCaWZZH8Yk0Fv2UTrD2jo3RjIX1mPxfuh3L5IW9KCPDglNZJec5oDpAumvkyriXuhKP5y0kDWBfqm34Pk2OVq4qDAXowEptvclDkELrP7qLy01Nhg-udX5WAzmY/s1600/200px-Dining_philosophers.png" /></a></div>
<br />
<i><a href="http://en.wikipedia.org/wiki/Dining_philosophers_problem">http://en.wikipedia.org/wiki/Dining_philosophers_problem</a></i><br />
<br />
<i>Five silent philosophers sit at a table around a bowl of spaghetti. A fork is placed between each pair of adjacent philosophers.</i><br />
<i>Each philosopher must alternately think and eat. Eating is not limited by the amount of spaghetti left: assume an infinite supply. However, a philosopher can only eat while holding both the fork to the left and the fork to the right (an alternative problem formulation uses rice and chopsticks instead of spaghetti and forks).</i><br />
<i>Each philosopher can pick up an adjacent fork, when available, and put it down, when holding it. These are separate actions: forks must be picked up and put down one by one.</i><br />
<i>The problem is how to design a discipline of behavior (a concurrent algorithm) such that each philosopher won't starve, i.e. can forever continue to alternate between eating and thinking.</i><br />
<br />
Ok, so I said in <a href="http://appenginedevelopment.blogspot.com/2011/10/not-your-mamas-web-server.html">the previous post</a> that AppEngine is a massive distributed machine. So on such a machine, we should be able to implement a solution to The Dining Philosophers Problem. How would we go about it?<br />
<br />
<b>Implementation of Semaphores</b><br />
<br />
Firstly, we need a working Semaphore implementation. In <a href="http://appenginedevelopment.blogspot.com/2011/10/not-your-mamas-web-server.html">the previous post</a> I sketched out an approach, but since then I've built a proper working version.<br />
<br />
Find the full Semaphore implementation here: <a href="https://github.com/emlynoregan/AppEngineDevelopment/blob/master/src/Semaphore.py">Semaphore.py on github</a><br />
<br />
<pre>class Semaphore(polymodel.PolyModel):
    _counter = db.IntegerProperty()
    _suspendList = db.StringListProperty()

    def ConstructSemaphore(cls, aCounter):
        retval = cls()
        retval._counter = aCounter
        retval._suspendList = []
        return retval
    ConstructSemaphore = classmethod(ConstructSemaphore)

    def Wait(self, obj, *args, **kwargs):
        while True:
            try:
                lneedsRun = db.run_in_transaction(
                        _doWait,
                        self.key(),
                        obj, *args, **kwargs
                    )
                if lneedsRun:
                    try:
                        obj(*args, **kwargs)
                    except Exception, ex:
                        logging.error(ex)
                break
            except TransactionFailedError:
                # do nothing
                logging.warning("TransactionFailedError in Wait, try again")

    def Signal(self):
        while True:
            try:
                db.run_in_transaction(_doSignal, self.key())
                break
            except TransactionFailedError:
                # do nothing
                logging.warning("TransactionFailedError in Signal, try again")

def _doWait(aKey, aObj, *args, **kwargs):
    lneedsRun = False
    lsem = db.get(aKey)
    if not lsem:
        raise Exception("Internal: failed to retrieve semaphore in _doWait")
    if lsem._counter > 0:
        lsem._counter -= 1
        logging.debug("counter: %s" % lsem._counter)
        lneedsRun = True
    else:
        logging.debug("about to defer")
        pickled = deferred.serialize(aObj, *args, **kwargs)
        pickled = base64.encodestring(pickled)
        logging.debug("after defer, pickled=%s" % pickled)
        lsem._suspendList.append(pickled)
    lsem.put()
    return lneedsRun

def _doSignal(aKey):
    lsem = db.get(aKey)
    if not lsem:
        raise Exception("Internal: failed to retrieve semaphore in _doSignal")
    if len(lsem._suspendList) > 0:
        logging.debug("about to unsuspend")
        pickled = lsem._suspendList.pop()
        pickled = base64.decodestring(pickled)
        logging.debug("pickled=%s" % pickled)
        # here I depickle the pickled call information, only to repickle inside
        # deferred.defer. Not ideal, but cleaner given the calls available
        # inside deferred.
        try:
            obj, args, kwds = pickle.loads(pickled)
        except Exception, e:
            raise deferred.PermanentTaskFailure(e)
        logging.debug("about to defer")
        deferred.defer(obj, _transactional=True, *args, **kwds)
        #deferred.run(pickled)
        logging.debug("after defer")
    else:
        lsem._counter += 1
    lsem.put()
</pre>
<br />
<br />
What I wrote last time was basically correct. I'm using <a href="http://blog.notdot.net/">Nick Johnson's</a> deferred.defer library to make the adding and running of tasks smoother (ie: hide the use of web handlers).<br />
<div>
<br /></div>
<div>
Some points of interest:<br />
<br />
The _suspendList of Semaphore is a stringlist, which stores the same pickled call information that deferred.defer uses. I've made liberal use of internal functions of deferred to make this work, basically because that library is written by someone far more competent with Python than me, so why not?</div>
<div>
<br /></div>
<div>
Wait() takes obj, *args, **kwargs and passes them to the call of _doWait() in a transaction. obj, *args and **kwargs define the call to make once we have successfully acquired the semaphore. So, inside _doWait(), we'll either need to add obj (et al) to the _suspendList, or, if we can get the semaphore immediately, we need to call obj(). However, we can't call it inside _doWait() because it's inside a transaction - we don't want or need to be inside the context of that transaction for the call to obj(). So instead, _doWait returns a bool, which is True in the case that we've acquired the semaphore, and you'll see that Wait checks this result, and calls obj() immediately if it's True. Thus, obj() is called if necessary, but outside the transaction.<br />
<br />
The pickled call information, created by deferred.serialize(), isn't safe to save in a stringlist (illegal characters, throws errors). So, I base64 encode it before saving it in _doWait(). You'll see in _doSignal that the pickled call information is base64 decoded.<br />
<br />
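Here's a rough sketch of that encode/decode round trip in plain Python 3, outside AppEngine. It uses pickle directly as a stand-in for deferred.serialize() (an assumption on my part about the shape of the data: a (callable, args, kwargs) tuple), and base64.b64encode rather than the older encodestring. serialize_call and run_serialized are names I've made up for the example.<br />
<br />
```python
import base64
import operator
import pickle

def serialize_call(func, *args, **kwargs):
    """Pickle a call description and base64 it so it is safe to store as text."""
    raw = pickle.dumps((func, args, kwargs), protocol=pickle.HIGHEST_PROTOCOL)
    return base64.b64encode(raw).decode("ascii")

def run_serialized(text):
    """Reverse of serialize_call: decode, unpickle, then make the call."""
    func, args, kwargs = pickle.loads(base64.b64decode(text))
    return func(*args, **kwargs)

# operator.add stands in for the real "call me once you hold the semaphore" target.
stored = serialize_call(operator.add, 2, 3)
result = run_serialized(stored)
```
<br />
The stored value is plain ASCII, which is the whole point: it can sit in a list of strings without upsetting anything.<br />
<br />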
You'll notice that inside _doSignal(), when there are waiting tasks on the _suspendList, I dequeue one (the last one, but order is irrelevant in a Semaphore implementation), unpickle it, but I don't call it. Instead, I add a task for it using deferred.defer(). I don't call it because the task we are in has just finished doing work while holding the semaphore, which might have been lengthy. These are short lived tasks; we shouldn't do more than one user-defined thing in one task. So, instead of running this second thing (the dequeued suspended task), I reschedule it to run immediately in another task. Note also that I mark that defer as transactional; it means that if the signal transaction fails, the task won't be enqueued, which is what we want.<br />
<br />
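If it helps, here's the same Wait/Signal bookkeeping as an in-memory sketch in plain Python 3. There's no datastore and no transaction here, so it only shows the logic, not the concurrency safety. SketchSemaphore, the tasks deque (standing in for the task queue) and critical are all made up for the example.<br />
<br />
```python
from collections import deque

class SketchSemaphore:
    """In-memory model of the counter and suspend list; not concurrency safe."""

    def __init__(self, counter, task_queue):
        self.counter = counter
        self.suspended = []
        self.task_queue = task_queue

    def wait(self, func, *args):
        # This part corresponds to what happens inside the transaction.
        if self.counter > 0:
            self.counter -= 1
            needs_run = True
        else:
            self.suspended.append((func, args))
            needs_run = False
        # The call itself happens afterwards, outside the "transaction".
        if needs_run:
            func(*args)

    def signal(self):
        if self.suspended:
            # Hand the waiter to the task queue instead of calling it here.
            self.task_queue.append(self.suspended.pop())
        else:
            self.counter += 1

tasks = deque()
sem = SketchSemaphore(1, tasks)
log = []

def critical(name):
    log.append("enter " + name)

sem.wait(critical, "a")   # counter goes 1 -> 0, "a" runs straight away
sem.wait(critical, "b")   # nothing free, "b" is suspended
sem.signal()              # "a" is done: "b" moves onto the task queue
while tasks:
    func, args = tasks.popleft()
    func(*args)
sem.signal()              # "b" is done: no waiters, counter goes back to 1
```
<br />
Notice that the counter stays at zero when a waiter is released; the semaphore passes directly from the signalling task to the waiting one.<br />
<br />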
One last note: In case it's not obvious, the transactions combined with a reload of the Semaphore ensure that we can safely use Semaphore objects even if they are stale. So don't worry about stale Semaphores causing contention issues. This is explained more in the previous post.<br />
<br />
Oh, and I think if you pass a queue name into Wait() in the same way you would to a regular call to deferred.defer() (ie: a parameter _queue="somequeuename"), the semaphore will use that queue instead of default, which might be handy.<br />
<br />
<b>Testing the Semaphores</b><br />
<br />
I've got some simple test code in Semaphore.py for giving Semaphores a run.<br />
<br />
<span class="Apple-style-span" style="font-family: monospace; white-space: pre;">def SemaphoreTest1():</span></div>
<pre>    logging.info("*****************************")
    logging.info("**   BEGIN SEMAPHORETEST1  **")
    logging.info("*****************************")
    lsem = Semaphore.ConstructSemaphore(2)
    lsem.put()
    lcount = 0
    while lcount < 20:
        deferred.defer(SemaphoreTest1EntryPoint, lsem.key(), lcount, True)
        lcount += 1

def SemaphoreTest1EntryPoint(aKey, aNum, aFirst):
    lsem = db.get(aKey)
    if not lsem:
        raise Exception("Failed to retrieve semaphore in EntryPoint1")
    if aFirst:
        # this is before we've got the semaphore
        logging.info("Before Wait for %s" % aNum)
        lsem.Wait(SemaphoreTest1EntryPoint, aKey, aNum, False)
    else:
        # we now have the semaphore
        logging.info("Begin Critsec for %s" % aNum)
        sleep(2) # stay inside critsec for 2 seconds
        logging.info("End Critsec for %s" % aNum)
        lsem.Signal()
        logging.info("After Signal for %s" % aNum)
</pre>
<br />
SemaphoreTest1() creates a new Semaphore with counter=2 (ie: allow max 2 tasks to be inside the Semaphore at any time), and schedules 20 tasks to run immediately, which all run SemaphoreTest1EntryPoint() with the new Semaphore (passed by key in the aKey parameter).<br />
<br />
SemaphoreTest1EntryPoint() loads the Semaphore, then takes one of two paths. The first time through aFirst is True (we don't hold the semaphore yet); it waits on the semaphore (switching aFirst to False). The second time through, with aFirst as False, where we hold the Semaphore, it sleeps for a couple of seconds, then signals the Semaphore and exits.<br />
<br />
The upshot of this is that the 20 tasks, all run at the same time, will contend for the Semaphore. The first two to get it will sit inside it for 2 seconds (a long time, while the other tasks keep waiting on it and suspending). Eventually these tasks holding it will signal it and exit. Each time a task signals, it'll kick off a waiting one, which in turn will get it, again taking two seconds, then exit. And so on until there are no more tasks left.<br />
<br />
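If you just want to see the "20 workers, at most 2 inside" behaviour without deploying anything, here's a local analogue in plain Python 3. To be clear, this swaps in threading.Semaphore, which is a different mechanism from the Datastore-backed Semaphore above; the worker function, the stats dict and the shortened sleep are all made up for the example.<br />
<br />
```python
import threading
import time

sem = threading.Semaphore(2)       # same counter as in SemaphoreTest1
stats_lock = threading.Lock()      # protects the bookkeeping below
stats = {"inside": 0, "max_inside": 0, "finished": 0}

def worker(num):
    with sem:                      # entering is the Wait()
        with stats_lock:
            stats["inside"] += 1
            stats["max_inside"] = max(stats["max_inside"], stats["inside"])
        time.sleep(0.01)           # stay inside the critical section briefly
        with stats_lock:
            stats["inside"] -= 1
            stats["finished"] += 1
    # leaving the with block is the Signal()

threads = [threading.Thread(target=worker, args=(n,)) for n in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```
<br />
After the run, stats["max_inside"] should never be more than 2, no matter how the threads were scheduled.<br />
<br />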
Try running this code, and fiddling with the parameters a bit. Note that <a href="https://github.com/emlynoregan/AppEngineDevelopment/tree/master/src">you'll find the whole project on GitHub, here</a>.<br />
<br />
<b>Back to Dinner</b><br />
<br />
So we've got a working Semaphore. So how do we implement a solution to the dining philosophers?<br />
<br />
Let's implement the classic deadlock algorithm first. It goes like this:<br />
<span class="Apple-style-span" style="background-color: white; font-family: sans-serif; font-size: 13px; line-height: 19px;"></span><br />
<ul style="line-height: 1.5em; list-style-image: url(data:image/png; list-style-type: square; margin-bottom: 0.5em; margin-left: 1.5em; margin-right: 0px; margin-top: 0.3em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">
<li style="margin-bottom: 0.1em;">think until the left fork is available; when it is, pick it up;</li>
<li style="margin-bottom: 0.1em;">think until the right fork is available; when it is, pick it up</li>
<li style="margin-bottom: 0.1em;">eat</li>
<li style="margin-bottom: 0.1em;">put the left fork down</li>
<li style="margin-bottom: 0.1em;">put the right fork down</li>
<li style="margin-bottom: 0.1em;">repeat from the start</li>
</ul>
First we need to represent a Fork. I'll use a Semaphore with counter=1, ie: while someone holds the fork, no one else may hold the fork:<br />
<br />
<pre>class Fork(Semaphore.Semaphore):
    def ConstructFork(cls):
        lfork = cls.ConstructSemaphore(1)
        return lfork
    ConstructFork = classmethod(ConstructFork)
</pre>
<br />
I used a polymodel for Semaphore, so we can subclass it as above.<br />
<br />
Next is an implementation of the faulty algorithm above. ThinkAndEatByKey is a wrapper of ThinkAndEat, which takes Forks by key rather than by object reference, loads them, and calls through. The real work happens in ThinkAndEat.<br />
<br />
<pre>def ThinkAndEatByKey(aFirstForkKey, aSecondForkKey, aIndex, aNumLoops, aHasFirst, aHasSecond):
    lFirstFork = db.get(aFirstForkKey)
    if not lFirstFork:
        raise Exception("Failed to retrieve Left Fork")
    lSecondFork = db.get(aSecondForkKey)
    if not lSecondFork:
        raise Exception("Failed to retrieve Right Fork")
    ThinkAndEat(lFirstFork, lSecondFork, aIndex, aNumLoops, aHasFirst, aHasSecond)

def ThinkAndEat(aFirstFork, aSecondFork, aIndex, aNumLoops, aHasFirst=False, aHasSecond=False):
    if not aHasFirst:
        # this is before we've got the semaphore
        logging.info("Wait on first for %s" % aIndex)
        aFirstFork.Wait(ThinkAndEatByKey, aFirstFork.key(), aSecondFork.key(), aIndex, aNumLoops, True, False)
    elif not aHasSecond:
        sleep(10) # takes a while to pick up the second fork!
        logging.info("Wait on second for %s" % aIndex)
        aSecondFork.Wait(ThinkAndEatByKey, aFirstFork.key(), aSecondFork.key(), aIndex, aNumLoops, True, True)
    else:
        logging.info("EAT for %s" % aIndex)
        logging.info("Dropping second fork for %s" % aIndex)
        aSecondFork.Signal()
        logging.info("Dropping first fork for %s" % aIndex)
        aFirstFork.Signal()
        if aNumLoops == 1:
            logging.info("Finished looping, done.")
        else:
            logging.info("Ready to think again, deferring")
            deferred.defer(
                ThinkAndEat,
                aFirstFork,
                aSecondFork,
                aIndex,
                aNumLoops-1
                )
</pre>
<br />
ThinkAndEat has three states. First, if we have neither Semaphore then aHasFirst and aHasSecond are false (I use First and Second instead of Left and Right for a bit of leeway later on). In this case, we wait on the first fork, and aHasFirst, aHasSecond will be True/False on the next call. This is the next case, where we then wait on the second fork, and aHasFirst, aHasSecond will both be True on the third call. Finally, when they are both true, we have both forks. So we Eat (just log something, but this could be a lengthy op of some kind), then Signal, ie: drop, both forks.<br />
<br />
Finally, we reschedule ThinkAndEat again to complete the loop.<br />
<br />
You'll note in the second state I've added a sleep for 10 seconds. That is, between picking up the first fork and picking up the second, our philosophers think for a really long time. This doesn't change the theoretical behaviour of the algorithm, but in practice it makes it very easy to <a href="http://en.wikipedia.org/wiki/Deadlock">deadlock</a> especially on the first iteration; everyone will pick up the first fork, there'll be a pause, then everyone will try to pick up the second fork and have to wait indefinitely.<br />
<br />
Note that I use a countdown, aNumLoops, to stop this running forever. Eventually, in my house, even the Philosophers need to finish and go home!<br />
<br />
Now, to finish implementing the algorithm above, we need to create all the forks, then call ThinkAndEat for each philosopher, passing in the correct forks.<br />
<br />
<pre>def DiningPhilosphersFailTest():
    lnumPhilosophers = 5
    lnumLoops = 5 # number of think/eat loops
    leta = datetime.datetime.utcnow() + datetime.timedelta(seconds=20)
    lforks = []
    lforkIndex = 0
    while lforkIndex < lnumPhilosophers:
        lfork = Fork.ConstructFork()
        lfork.put()
        lforks.append(lfork)
        lforkIndex += 1
    lphilosopherIndex = 0
    while lphilosopherIndex < lnumPhilosophers:
        deferred.defer(
            ThinkAndEat,
            lforks[lphilosopherIndex],
            lforks[(lphilosopherIndex+1) % lnumPhilosophers],
            lphilosopherIndex,
            lnumLoops,
            _eta = leta
            )
        lphilosopherIndex += 1
</pre>
<br />
This method sets up 5 philosophers with 5 forks, who will perform the ThinkAndEat loop 5 times. Philosopher i (i from 0 to 4) gets left fork i, right fork i+1, except for philosopher 4, who gets left fork 4, right fork 0 (ie: a round table).<br />
<br />
When you run this method, it tends to deadlock every time. You can watch the default task list until the number of queued tasks goes to zero, then go to the datastore and run this query:<br />
<br />
select * from Fork where _counter = 0<br />
<br />
You should get a bunch of results; each one is a Fork (ie: Semaphore) which has had a successful Wait(), but no comparable Signal(). For this to be the case when no tasks are running requires that a set of philosophers (all of them in fact) have one fork and are waiting for another. Deadlock.<br />
<br />
Now to fix this, <a href="http://en.wikipedia.org/wiki/Edsger_W._Dijkstra">Dijkstra</a> (who originally posed this problem) proposed a strict ordering of resources (forks) along with a rule that philosophers acquire forks only in resource order. So, if we use the fork numbering above, and say we must acquire lower numbered forks before higher numbered, then we have the same algorithm *except* that philosopher 4 must now change to acquire his right fork first (fork 0) followed by his left fork (fork 4). <br />
<br />
We can achieve this simply by swapping forks, passing fork 0 as FirstFork and fork 4 as SecondFork. This is of course why I used First and Second rather than Left and Right.<br />
<br />
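To convince yourself that the ordering rule is what makes the difference, here's a tiny lockstep simulation in plain Python 3. It is not the AppEngine code: fork_pairs and run are made up for the example, and the scheduling (every philosopher takes one step per round, in index order) is an assumption chosen to mimic "everyone picks up their first fork, then everyone reaches for the second".<br />
<br />
```python
def fork_pairs(n, ordered):
    """(first, second) fork numbers per philosopher; ordered=True applies the lower-number-first rule."""
    pairs = []
    for i in range(n):
        first, second = i, (i + 1) % n
        if ordered and first > second:
            first, second = second, first
        pairs.append((first, second))
    return pairs

def run(pairs, loops=1):
    """Return True if every philosopher eats `loops` times, False on deadlock."""
    holder = {}                         # fork number -> philosopher holding it
    held = [0] * len(pairs)             # how many forks each philosopher holds
    meals = [0] * len(pairs)
    while any(m < loops for m in meals):
        progressed = False
        for i, (first, second) in enumerate(pairs):
            if meals[i] >= loops:
                continue
            if held[i] == 0 and first not in holder:
                holder[first] = i
                held[i] = 1
                progressed = True
            elif held[i] == 1 and second not in holder:
                holder[second] = i
                held[i] = 2
                progressed = True
            elif held[i] == 2:
                # eat, then drop both forks
                del holder[first], holder[second]
                held[i] = 0
                meals[i] += 1
                progressed = True
        if not progressed:
            return False                # nobody could move: deadlock
    return True

naive_finishes = run(fork_pairs(5, ordered=False))
ordered_finishes = run(fork_pairs(5, ordered=True))
```
<br />
With the naive pairs everyone ends up holding one fork and the simulation stalls; with the ordered pairs philosopher 4 has to wait for fork 0 before touching fork 4, so fork 4 stays free and philosopher 3 can get started.<br />
<br />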
So here's the non-deadlocking solution:<br />
<br />
<pre>def DiningPhilosphersSucceedTest():
    lnumPhilosophers = 5
    lnumLoops = 5 # number of think/eat loops
    leta = datetime.datetime.utcnow() + datetime.timedelta(seconds=20)
    lforks = []
    lforkIndex = 0
    while lforkIndex < lnumPhilosophers:
        lfork = Fork.ConstructFork()
        lfork.put()
        lforks.append(lfork)
        lforkIndex += 1
    lphilosopherIndex = 0
    while lphilosopherIndex < lnumPhilosophers:
        if lphilosopherIndex < lnumPhilosophers-1:
            # not the last one
            deferred.defer(
                ThinkAndEat,
                lforks[lphilosopherIndex],
                lforks[lphilosopherIndex+1],
                lphilosopherIndex,
                lnumLoops,
                _eta = leta
                )
        else:
            # the last one
            deferred.defer(
                ThinkAndEat,
                lforks[0],
                lforks[lphilosopherIndex],
                lphilosopherIndex,
                lnumLoops,
                _eta = leta
                )
        lphilosopherIndex += 1
</pre>
<br />
If you run this you won't get deadlock. You should find that when all tasks complete (the task queue is empty), "select * from Fork where _counter = 0" will give you zero objects.<br />
<div>
<br /></div>
You can safely run either of these solutions with fewer or much larger numbers of Philosophers, and more or fewer loops. To get deadlock with the first solution, you'll need to ensure tasks can run together. I find this takes two things: you need a large bucket size and replenishment rate on your default task queue, and you need to give it a decent number of loops, to give AppEngine time to spin up enough instances to run a lot of parallel tasks. To deadlock, you'll need to be able to run all Philosopher tasks concurrently; if they run sequentially they'll resolve out and fail to deadlock. Remember that the deadlocking algorithm isn't guaranteed to deadlock, that's only a possibility. The more philosophers, the harder it'll be to see it in practice (although you should see a lot of contention in the logs).<br />
<br />
One more note: You can't see this deadlock with the development appserver. Why not? Because the development server runs tasks sequentially. The Semaphores and Tasks will work, but in a very uninteresting one-task-at-a-time way.<br />
<br />
<b>Lifetime issues with Semaphores</b><br />
<br />
You'll notice that all these tests leave Semaphore objects lying around afterwards.<br />
<br />
Classically, semaphores are in-memory structures, and require memory management techniques for lifetime management. We have an analogous cleanup requirement for these Datastore backed Semaphores.<br />
<br />
You'll need some way to know when you are finished with them, so you can then delete them. For some uses it may be clear (eg: if they are always associated with another resource, then you create them when that resource is created and delete them when that resource is deleted). Other times, you may know that they should only live a certain length of time, so for example you could add a time-to-live or a delete-me-after field to them, and have a background process cleaning up every so often.<br />
<br />
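As a rough idea of the delete-me-after approach, here's a sketch in plain Python 3 with a dict standing in for the datastore. The delete_after field name, the record layout and the cleanup function are all hypothetical; on AppEngine the same sweep would be a query on that field run from a cron job or similar.<br />
<br />
```python
import datetime

def expired_keys(records, now):
    """Keys of records whose delete_after time has passed."""
    return sorted(key for key, rec in records.items() if rec["delete_after"] <= now)

def cleanup(records, now):
    """Delete expired records in place and report which keys were removed."""
    removed = expired_keys(records, now)
    for key in removed:
        del records[key]
    return removed

now = datetime.datetime(2011, 10, 23, 12, 0, 0)
hour = datetime.timedelta(hours=1)

# Stand-in "datastore": one semaphore past its time, one still live.
semaphores = {
    "sem-old": {"counter": 1, "delete_after": now - hour},
    "sem-new": {"counter": 0, "delete_after": now + hour},
}
removed = cleanup(semaphores, now)
```
<br />
One thing to be careful of in a real version: a semaphore with a non-empty suspend list still has tasks depending on it, so you'd probably want to check that before deleting.<br />
<br />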
<b>Very interesting, but why do I care? </b><br />
<br />
AppEngine lends itself to all kinds of uses where we are managing access to resources. Either we are managing something limited (with a semaphore with a counter equal to the number of resources available), or we are managing contention over something which can only be touched by one task (with a semaphore with counter=1, otherwise known as a Mutex).<br />
<br />
In my app <a href="http://my.syyn.cc/">Syyncc</a>, I used a dumb cron loop to move changes from one social network to another. Because changes move in multiple directions, and the central mechanism isn't concurrency safe, I needed to use a single processing loop to ensure I was only working on one thing at a time. It was a brain dead approach to controlling a concurrency issue.<br />
<br />
But it doesn't scale. It'll process 50 items at a time, once every 2 minutes. That's severely limited.<br />
<br />
Instead, I intend to model each user's combination of connected networks (fb, G+, etc) as a "hub", which is protected by a Mutex (Semaphore with counter=1). I can have separate tasks (<a href="http://appenginedevelopment.blogspot.com/2011/10/worker.html">Workers</a>) monitoring each network, and driving changes to the others, which don't communicate except that they share the hub Mutex, so only one can run at once. So, if there are changes in a user's G+ and Facebook at the same time, processing those changes will be concurrency safe, with the critical shared pieces protected.<br />
<br />
<b>Why not just use transactions?</b><br />
<br />
Because transactions only protect the datastore, and enqueuing of tasks.<br />
<br />
Resources under contention can be all kinds of things. In the case of Syyncc, they involve reading and writing the datastore *and* reading and writing social networks. A model of managing contention is better here than transactions (simply because the latter aren't available!)<br />
<br />
<b>And, I want to make a State Machine</b><br />
<br />
A project I've got coming up involves starting and stopping virtual machines in AWS. To do that well turns out to be involved, and really I want a full State Machine, where I can define a state graph, with nodes, conditions, transitions, ooh. And to build that, I'll need Semaphores.<br />
<br />
Expect to see an article on implementation of a Semaphore-based State Machine in the near future.<br />
<br />Anonymoushttp://www.blogger.com/profile/11980745475562786998noreply@blogger.com9tag:blogger.com,1999:blog-4275632007165220704.post-67204911241223853082011-10-16T18:29:00.002+10:302011-10-16T22:57:33.310+10:30Not Your Mama's Web Server<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhc1vRXxFqG0IP9BSGSWvdMUA_35Ut1ODGMVPl4VNQjyjthlKDEhkFjU0df10NdWpCk-nnTU4tZ0g9_3oNCZLBezNE4LMxhV9Ea7vq6dlkvJEST5ga-ejOidq_F1UkTwAd4Vm-e31ZYPYA/s1600/yomama.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhc1vRXxFqG0IP9BSGSWvdMUA_35Ut1ODGMVPl4VNQjyjthlKDEhkFjU0df10NdWpCk-nnTU4tZ0g9_3oNCZLBezNE4LMxhV9Ea7vq6dlkvJEST5ga-ejOidq_F1UkTwAd4Vm-e31ZYPYA/s1600/yomama.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">It's a distributed machine, turkey. </td></tr>
</tbody></table>
<br />
AppEngine is a funny kind of beast. It's packaged up and presented more or less as a weird kind of futuristic virtual web host; sort of like a normal web host, but the plumbing's taken care of for you. Higher level than normal web hosts, kind of annoyingly proprietary, and it works sort of weirdly when you scratch the surface; the datastore isn't quite like a sql database, the web server's not quite like a proper web server, things are just a little queer.<br />
<br />
They're queer because it's not actually a webserver. Or, well, it is a webserver, but that's like calling a Tesla Roadster a horseless carriage. And then complaining that the reins are weird. (And making the SQL service is kind of like putting the reins in.)<br />
<br />
What it really is, is a new kind of computer (and by new I mean 40 year old computer science, but that's in no way a criticism). A distributed one. It's made out of lots of little physical ones, but they're mostly abstracted away. What we users get to touch is a high level machine where we can run some code in some well defined and interesting ways, and where that code can talk to some very interesting global resources (memcache, datastore, the webby bits of the internet, etc).<br />
<br />
What's really interesting to me is what is always interesting about distributed systems; massive concurrency. *Massive* concurrency. Concurrency on a scale where there aren't really specific limitations on how much stuff we can throw in in parallel; we seem to be limited largely by our wallets. And if you're a bit careful and pay a bit of attention, it's pretty cheap and that limitation is pretty minor. So that's fun.<br />
<br />
But this concurrency isn't the within-one-process, multi-threaded computing type of concurrency. This is old school, it feels a lot more like multi-processing, processes running in their own address spaces, maybe on separate physical hardware, at a great distance from one another. By distance here I mean that the inter-process communication mechanisms are many orders of magnitude slower than the code. These are not threads, that behave like people hanging out, chatting with one another. These are more like characters in a Dickens novel, on separate continents, sending letters by sea. Charming and old world, if a little inconvenient. Again, not a criticism, this is how distributed computing is.<br />
<br />
(Incidentally, I know that you can do multi-threaded programming on AppEngine, but I just think that's a bit boring and misses the point of the platform. Threads live inside one request, ie: process, and only live as long as that process lives. You can do some optimisation stuff with them, but they are at the wrong level of abstraction to be the real game here.)<br />
<br />
So what are the elements of this distributed machine?<br />
<br />
<b>Looking at AppEngine as a Distributed Machine</b><br />
<br />
Well, each user account has a bunch of program slots (10, or as many as you like for people with big wallets). Each slot, each "app", is the single-distributed-machine abstraction level. The slot/app/machine has its own resources, and lets you run one program.<br />
<br />
Each program has a bunch of potential entry points (web request handlers). They can be used to kick off copies of the program. That can happen because of user requests, or due to scheduled task calls, or cron jobs, or backend jobs (other?). In practice, the program, having multiple entry points, is functionally a set of cooperating programs, more or less divided up by entry point (ie: web request handler).<br />
<br />
Each time (a piece of) the program is kicked off, it gets its own address space. (Well, if you mark your programs as "threadsafe", then it might share address space with other copies of itself, but that's again an optimisation that can be ignored here.) But in any case, it's a program, loaded into an address space and being executed. We might call it a "process".<br />
<br />
The process can do things. One special purpose thing it might do, in response to a user request, is to construct and return a web page. But it can do all kinds of other things.<br />
<br />
Simplest among them is to do some calculation of something. Maybe you want to figure out a metric tonne of prime numbers? That sort of thing. In this case, you just do your calc and finish. Only thing to consider here is that there can be a limitation on the lifespan of a process (ie: 60 seconds, if you're using a frontend).<br />
<br />
Anything more complex than simple calculations will need to talk to things outside of the process. Even simple calcs presumably need to go somewhere when they are complete, so you can read the results. So we need interprocess communications mechanisms. What's available?<br />
<br />
The most obvious IPC mechanism is the datastore. We can put to the datastore, and get from the datastore. The datastore is persistent, so processes can talk to each other across time and space. The datastore is pretty slow compared to intra-process work (sea mail), although we can ameliorate that a little with memcache.<br />
<br />
Interestingly, we probably can't reliably use memcache for this on its own, because it can expire without warning. Actually, hold that thought, there might be interprocess comms jobs to which it is suited, I'll keep an eye out for them.<br />
<br />
The channels API could also potentially be used for interprocess communication. It's supposed to be for pushing event information to web clients, but a client could just as easily be a process inside this virtual machine. Then the channels would look like some pretty sweet message queues. So, keep that in mind too.<br />
<br />
But the datastore is the bread and butter here.<br />
<br />
Now there are roughly two ways of thinking about processes in this machine, two models; roughly, they are Long Lived and Short Lived.<br />
<br />
In the Long Lived model, processes live indefinitely, so they need to communicate directly with each other. This is where the channels API could be interesting, acting like one-way pipes between unix processes. Old school message passing. They might also communicate by polling datastore objects, waiting for things to happen. This will require using backends to accomplish, so is probably a bit pricey. It also feels a little clunky; lots of sitting in loops monitoring things.<br />
<br />
The Short Lived model is a bit more interesting, mainly because it feels more like the way the platform, the virtual distributed machine, wants to behave. In this model, the processes are short lived, and crucially never talk directly to each other. Instead, they all interact with the datastore. They are kicked off from somewhere (usually as Tasks, I'll get to that), and have arguments which, along with the entry point used, define what they are supposed to do. They read from the datastore to figure out more context, access shared resources, whatever. They perform calculations. They write back to the datastore. And they might queue up more processes (tasks) to run, as appropriate. So the processes behave as individual workers chipping away at the datastore, a shared medium, which is the coordinating point for all their work. Rather than talking directly to each other, the processes lay down cues, and notice cues laid down by others, doing their work based on those cues. It's a bit stigmergic.<br />
<br />
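Here's a toy version of that Short Lived model in plain Python 3, just to make the shape concrete. A dict plays the datastore and a deque plays the task queue; count_primes_task, is_prime and the chunk size are made up for the example, and everything runs in one process, so there's no real concurrency going on.<br />
<br />
```python
from collections import deque

datastore = {"total": 0, "tasks_run": 0}   # stand-in for the shared datastore
task_queue = deque()                       # stand-in for the task queue

def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def count_primes_task(start, stop, chunk):
    """One short lived job: handle a chunk, write back, enqueue the next chunk."""
    end = min(start + chunk, stop)
    found = sum(1 for n in range(start, end) if is_prime(n))
    datastore["total"] += found            # write results to the shared medium
    datastore["tasks_run"] += 1
    if end < stop:
        # Leave a cue for the next worker rather than carrying on ourselves.
        task_queue.append((count_primes_task, (end, stop, chunk)))

# Kick things off with one task, then let the "scheduler" drain the queue.
task_queue.append((count_primes_task, (0, 100, 10)))
while task_queue:
    func, args = task_queue.popleft()
    func(*args)
```
<br />
No single task ever does more than a small chunk of work, and the only thing the tasks share is what they've written to the datastore.<br />
<br />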
Using short lived processes means we can be event driven. Programs can be designed to only be doing work when there's something which needs doing; no processes are running if there is no work to do. Combining lots of event driven pieces together means that, at a larger scale, we can approximate the process model of traditional computers, performing work when needed and blocking when waiting for results from external sources.<br />
<br />
To accomplish this, we first need an implementation of invokable, short lived processes. AppEngine Tasks fit this requirement. If we schedule a task to run now, we can give it arguments/context, it can run when there are resources available, it can do a short job and finish, possibly spawning more tasks. If you squint a little, this is making AppEngine's task scheduler look a little like an operating system's process scheduler.<br />
<br />
If we use multiple tasks in concert with one another, we can emulate blocking and waiting. If a task gets to a point where it needs to wait indefinitely on an external event, we can let it quit, and arrange it so that another task can pick up where it left off when the event occurs. So our Task/Process is a bit more limited than a normal machine process, but in aggregate tasks are as powerful as we need them to be.<br />
<br />
So in this context, we've got Processes which can do work, and which can potentially block & wait. The next thing we need is some communication primitives to let them coordinate effectively.<br />
<br />
<b>Semaphores</b><br />
<br />
Ok, now I make an admission: I still own books. Not for any real information value, mind you, but for sentimental reasons. I don't own many, but I have a little bookshelf in my study, and some of my textbooks from my computer science degree (long ago and far away) sit yellowing, waiting to be called back into service. This is as good a day as any.<br />
<br />
Flicking through them, I've found "Principles of Concurrent and Distributed Programming", by Ben-Ari (edited by no less than C.A.R. Hoare). Flip, flip, ah here we are, Semaphores.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCGXoWvyW2fFQmQ6t8ORF2TyN0X0H9Pc-Rvg5uflBlWXePiV6zRB0owISlipz-24spy2_D-8vD_BUtVbgEB0o6YtxZMH6JU8b1QVwEjgvWpGOKyLqXBvHxGcAaiD-0usuYtVGia8oPlNQ/s1600/2011+-+2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCGXoWvyW2fFQmQ6t8ORF2TyN0X0H9Pc-Rvg5uflBlWXePiV6zRB0owISlipz-24spy2_D-8vD_BUtVbgEB0o6YtxZMH6JU8b1QVwEjgvWpGOKyLqXBvHxGcAaiD-0usuYtVGia8oPlNQ/s640/2011+-+2.jpg" width="640" /></a></div>
<br />
<br />
So can we implement a Semaphore appropriate to AppEngine?<br />
<br />
If we model the Semaphore as a datastore object, then it needs two methods, Wait and Signal. It also needs an internal integer count (the actual semaphore value), and a queue of waiting processes.<br />
<br />
First of all, Wait and Signal need to be atomic. How do we do that in the context of Datastore? With a transaction.<br />
<br />
<a href="http://code.google.com/appengine/docs/python/datastore/transactions.html">http://code.google.com/appengine/docs/python/datastore/transactions.html</a><br />
<br />
Code inside a normal transaction can work with objects in one entity group. In this case we're inside the limitations, because we only need to work with the object itself in order to implement atomicity of Wait and Signal.<br />
<br />
The one difference between a transaction and a traditional critical section (as required by Wait and Signal) is that the transaction can fail. The simplest way to make a transaction into a critical section is to put it inside a loop. Python-like pseudocode:<br />
<br />
<pre>while True:
    try:
        perform transaction
        break
    except transaction-fails:
        pass  # do nothing, we'll head around the loop again
# here we succeeded
</pre>
Now that could potentially starve. If this is in a front-end task, it'll time out eventually. We'll have to deal with timing out anyway, so let's leave it for that. If this is in a backend, under extreme contention it could just fail repeatedly forever. OTOH if we get into that situation, we're probably doing something spectacularly wrongly.<br />
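To make the starvation worry concrete, here's a minimal sketch of the same retry loop with a cap on the number of attempts. It doesn't touch the datastore at all: TransactionFailed, run_as_critical_section and max_attempts are names I've made up for illustration, and the "transaction" is just a function that fails a couple of times before succeeding.<br />

```python
class TransactionFailed(Exception):
    """Hypothetical stand-in for a datastore transaction failure."""


def run_as_critical_section(transaction, max_attempts=10):
    """Retry `transaction` until it succeeds or the attempts run out.

    Returns the transaction's result. If every attempt fails, the last
    failure is re-raised so the caller can decide what to do about it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except TransactionFailed:
            if attempt == max_attempts:
                raise
            # contention: head around the loop and try again


# Usage: a fake transaction that fails twice, then succeeds.
attempts = []


def flaky_transaction():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransactionFailed("simulated contention")
    return "committed"


result = run_as_critical_section(flaky_transaction)
```

Capping the attempts trades "might spin forever" for "might give up", which is probably the better failure to have in a backend.<br />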
<br />
Now the next difficult thing is that we don't have a way to suspend a task and wake it again, continuing where it left off, when done. But, we could make something a bit crappier that'd do the job.<br />
<br />
We could say that if you have a process that wants to wait on a semaphore, then it needs to be broken into two pieces. The first piece works up to where it waits on the semaphore. The second piece is run when it gets through wait, and must signal the semaphore at some point.<br />
<br />
So say this is our desired process code:<br />
<br />
<pre>EntryPoint:
    semaphore = GetTheSemaphoreFromSomewhere  # I'll get to this later
    inconsequential-stuff-1
    semaphore.wait()
    do-something-with-shared-resource-protected-by-semaphore-2
    semaphore.signal()
    inconsequential-stuff-3
</pre>
<br />
Then that needs to be split to become<br />
<br />
<pre>
EntryPoint1:
    semaphore = GetTheSemaphoreFromSomewhere
    inconsequential-stuff-1
    semaphore.wait(EntryPoint2)  # we tell the semaphore where we need to go next
EntryPoint2:
    semaphore = GetTheSemaphoreFromSomewhere
    do-something-with-shared-resource-protected-by-semaphore-2
    semaphore.signal()
    inconsequential-stuff-3
</pre>
<br />
Ok. Then how do we implement the semaphore? Something like this. Pythonish pseudocode again.<br />
<br />
<pre>class Semaphore(datastore object):
    _counter = int property
    _suspendList = entry point list property

    def Wait(self, NextEntryPoint):
        while True:
            try:
                db.run_in_transaction(_doWait, self.key(), NextEntryPoint)
                break
            except transaction-fails:
                pass  # do nothing, go around again

    def Signal(self):
        while True:
            try:
                db.run_in_transaction(_doSignal, self.key())
                break
            except transaction-fails:
                pass  # do nothing, go around again

def _doWait(key, NextEntryPoint):
    # re-read inside the transaction so we see the current state
    self = db.get(key)
    if self._counter > 0:
        self._counter -= 1
        call(NextEntryPoint)
    else:
        self._suspendList.append(NextEntryPoint)
    self.put()

def _doSignal(key):
    self = db.get(key)
    if len(self._suspendList) > 0:
        NextEntryPoint = self._suspendList.pop(0)  # oldest waiter goes first
        call(NextEntryPoint)
    else:
        self._counter += 1
    self.put()
</pre>
<div>
That looks easy, doesn't it?</div>
<div>
<br /></div>
<div>
I haven't explained what entry points are; they could just be the relative url for a request handler. Then, the entry point list property could just be a string list property, and call(NextEntryPoint) could simply do a taskqueue.add() for that entry point.<br />
<br />
Alternatively, entry points could be pickled function objects, kind of like what's used in deferred.defer. Actually that could be pretty sweet!</div>
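To check that the Wait/Signal logic hangs together, here's a rough in-memory sketch of the same idea. It is not the datastore version: there are no transactions (everything runs in one thread), entry points are plain Python callables, and task_queue, call, run_all_tasks and InMemorySemaphore are all names invented for this example. The deque stands in for the task queue.<br />

```python
from collections import deque

task_queue = deque()  # stand-in for the App Engine task queue


def call(entry_point):
    """Schedule an entry point to run later, the way a task would be."""
    task_queue.append(entry_point)


def run_all_tasks():
    """Drain the queue, running one task at a time."""
    while task_queue:
        task_queue.popleft()()


class InMemorySemaphore(object):
    def __init__(self, count=1):
        self._counter = count
        self._suspend_list = deque()

    def wait(self, next_entry_point):
        if self._counter > 0:
            self._counter -= 1
            call(next_entry_point)      # free slot: carry straight on
        else:
            self._suspend_list.append(next_entry_point)  # park until a signal

    def signal(self):
        if self._suspend_list:
            call(self._suspend_list.popleft())  # wake the oldest waiter
        else:
            self._counter += 1


log = []
semaphore = InMemorySemaphore(count=1)


def make_process(name):
    """Build a two-piece process: piece 1 waits, piece 2 signals."""
    def entry_point_1():
        log.append(name + ": before wait")
        semaphore.wait(entry_point_2)

    def entry_point_2():
        log.append(name + ": holding resource")
        semaphore.signal()
        log.append(name + ": done")

    return entry_point_1


call(make_process("A"))
call(make_process("B"))
run_all_tasks()
```

Process B's second piece only runs after A has signalled, which is the behaviour we're after. The real thing would need the transactions back, since tasks can genuinely run at the same time.<br />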
<div>
<br /></div>
<div>
Also, where did we get the semaphore? Well, we probably want to be able to create them by name, and retrieve them by name. We could use that name as the entity key, or part of the entity key. Then you create a semaphore somewhere, and later retrieve it by looking it up by key (nice and quick).</div>
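Create-by-name and retrieve-by-name can be sketched with a plain dictionary standing in for the datastore. NamedSemaphore, get_or_create and the "semaphore:" key prefix are made up for this example; on the datastore itself, db.Model.get_or_insert does a comparable get-or-create by key name.<br />

```python
class NamedSemaphore(object):
    _registry = {}  # stand-in for the datastore: key name -> semaphore

    def __init__(self, name, count):
        self.name = name
        self.count = count

    @classmethod
    def get_or_create(cls, name, count=1):
        """Return the semaphore stored under this name, creating it once."""
        key = "semaphore:" + name  # the name is part of the entity key
        if key not in cls._registry:
            cls._registry[key] = cls(name, count)
        return cls._registry[key]


created = NamedSemaphore.get_or_create("printer", count=2)
fetched = NamedSemaphore.get_or_create("printer")  # looked up, not recreated
```

Here the second call ignores its count argument because the semaphore already exists, which is the behaviour you'd want from a lookup by key.<br />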
<div>
<br /></div>
<div>
---</div>
<div>
<br /></div>
<div>
oh man I have to stop.</div>
<div>
<br /></div>
<div>
In the next post, I'll include a proper semaphore implementation in python. </div>
<div>
<br /></div>
<div>
I'd better get it done soon though; I'm having some philosophers to dinner. </div>
<div>
<br /></div>
<b>Multi-threaded Python 2.7 WTFAQ?</b> (posted 2011-10-15)<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHhXIdWuqtZRCfr3AXPo_V-C11ug-Hkk1tDUsfs5dz4b5JdyBOJZO2YpdFIK9h3w1d4_clob6JJ_hugNyP3-m1nhm4ISm6EA193E6rqzYnHPNEgepOqdmwxItfFWFD2Vg4rOVyj9-6hnc/s1600/SerpentSensationNameContestHeldforTwo-HeadedSnake.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><span class="Apple-style-span" style="font-family: inherit;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHhXIdWuqtZRCfr3AXPo_V-C11ug-Hkk1tDUsfs5dz4b5JdyBOJZO2YpdFIK9h3w1d4_clob6JJ_hugNyP3-m1nhm4ISm6EA193E6rqzYnHPNEgepOqdmwxItfFWFD2Vg4rOVyj9-6hnc/s320/SerpentSensationNameContestHeldforTwo-HeadedSnake.jpg" width="320" /></span></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span class="Apple-style-span" style="font-family: inherit; font-size: small;">No, I said multi-threaded... ah close enough.</span></td></tr>
</tbody></table>
<span class="Apple-style-span" style="background-color: white; font-family: inherit;"><br /></span><br />
<span class="Apple-style-span" style="background-color: white; font-family: inherit;">I'm just beginning my first experiments with python 2.7 apps, using "threadsafe: true". But I'm a clueless n00b as far as python goes. Well, not a n00b, but still a beginner. And then this multi-threading thing turns up, and I find myself groaning "oh man, really, does it have to get this complex?" I think I hear a lot of similar groans out there ;-)</span><br />
<span class="Apple-style-span" style="background-color: white; font-family: inherit;"><br />I'm betting that the whole "multithreaded" thing in python appengine apps is scaring plenty of people. I've done a lot of concurrent programming, but the prospect of dealing with threading in python has daunted me a bit because I'm a beginner with python and appengine as it is - this just makes life harder. But hey, it's being added for a reason; I'd best quit complaining and start figuring it out!<br /><br />Thinking about threads and python, I realised that I didn't know how I needed to actually use multi-threading to make my apps leaner and meaner. I mean, why would I use threads? They're for doing inherently concurrent things. Serving up pages isn't inherently concurrent stuff, at the app development level. What exactly is expected here? Shouldn't the framework be doing that kind of thing for me?<br /><br />And of course that was the aha moment. The framework *is* doing the work for me.<br /><br />The situation with python appengine development up until now has been that instances process requests serially. They take a request, see it through to its end. They take another request. And so on. That's cool, but instances spend a lot of time sitting around waiting when they could be doing more work.<br /><br />But with the new python 2.7 support, you can tell appengine that it would be ok to give instances more work when they are blocked waiting for something. e.g. if they are doing a big url fetch, or a long query from datastore, something like that, then it's cool to give them another request to begin working on, and come back to the waiting request later when it's ready. You do that by setting "threadsafe: true" in your app.yaml.<br /><br />Being threadsafe sounds scary! But actually it shouldn't be a huge deal. Pretty much it's about what you shouldn't do, and largely you're probably not doing it anyway.</span><br />
<span class="Apple-style-span" style="background-color: white; font-family: inherit;"><br /></span><br />
<span class="Apple-style-span" style="font-family: inherit;"><b>The WTFAQ</b></span><br />
<span class="Apple-style-span" style="font-family: inherit;"><br /></span><br />
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;">I drove off a cliff and was trapped in my car for the last couple of weeks, surviving on old sauce packets and some pickles that were on the floor. So I'm a bit out of the loop. WTFAQ are you talking about?</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;">This: <a href="http://code.google.com/appengine/docs/python/python27/">http://code.google.com/appengine/docs/python/python27/</a></span></li>
</ul>
<li><span class="Apple-style-span" style="background-color: white; font-family: inherit;">Threads are when my socks are wearing out and there are dangly bits. Multithreading is when they are really worn out. Right? </span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;"><span class="Apple-style-span" style="background-color: white;">Multi-threading means having multiple points of execution on the one </span><span class="Apple-style-span" style="background-color: white;">codebase in the one address space. You can do some really cool stuff with threads. Or you can safely ignore them.</span></span></li>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">But there is some minimal stuff I should be paying attention to, right?</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Yup. What you need to know is how to support Concurrent Requests.</span></li>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Concurrent what now?</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Concurrent Requests means that your instances can serve multiple requests at a time, instead of just one at a time. You pay for instances, so getting more work out of each one should make things a bit cheaper.</span></li>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">I like money.</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;">yes, ok.</span></li>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Ok, so what do I do to get these durn newfangled concurrent whatsits?</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;">It's easy. Just follow these steps:</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;">You'll be using Python 2.7</span></li>
<li><span class="Apple-style-span" style="font-family: inherit;">To use Python 2.7 you have to use the High Replication Datastore. If your app has been around for a while (from before there was a choice of datastore type) then you might be using the Master/Slave datastore. If so, you need to migrate. If you think that's you, then read this:</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;"><a href="http://code.google.com/appengine/docs/python/datastore/hr/">http://code.google.com/appengine/docs/python/datastore/hr/</a></span></li>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Read this, but don't let it freak you out:</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;"><a href="http://code.google.com/appengine/docs/python/python27/">http://code.google.com/appengine/docs/python/python27/</a></span></li>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Also glance over the new Getting Started sample, it's a bit different.</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;"><a href="http://code.google.com/appengine/docs/python/gettingstartedpython27/">http://code.google.com/appengine/docs/python/gettingstartedpython27/</a></span></li>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">If you got this far and haven't read any of the links above, congratulations. RTFM is for girly men (of all genders). </span></li>
<li><span class="Apple-style-span" style="font-family: inherit;">Figure out if your app is going to be ok:</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Calls to memcache, datastore, other services of AppEngine, are fine.</span></li>
<li><span class="Apple-style-span" style="font-family: inherit;">urlfetch and other httpish stuff (urllib, urllib2?) is fine.</span></li>
<li><span class="Apple-style-span" style="font-family: inherit;">Normal code touching local variables is fine.</span></li>
<li><span class="Apple-style-span" style="font-family: inherit;">Don't mess with instance memory (unless you know what you're doing). Mostly you can only use it for caching anyway; if you're not already doing that, don't worry about it. Basically, this means staying away from global variables. Multiple requests can come in and fiddle with those globals at the same time, Which Can Be Bad.</span></li>
<li><span class="Apple-style-span" style="font-family: inherit;">Libraries included by AppEngine are fine, or else you'll get "don't use this" warnings. So don't worry too much here. But do check this link for changes to libraries with Python 2.7, some of that might be relevant to you.</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;"><a href="http://code.google.com/appengine/docs/python/python27/newin27.html">http://code.google.com/appengine/docs/python/python27/newin27.html</a></span></li>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">You didn't read that, did you? You are Rock & Roll incarnate.</span></li>
<li><span class="Apple-style-span" style="font-family: inherit;">Some of your third party libraries might be messing with global memory, and not be threadsafe. You know that shady date library you scored in a back alley on <a href="http://code.google.com/hosting">http://code.google.com/hosting</a>? That might be a problem. Read the code, ask around, or just give it a shot and flag the fact that it might blow up in your face.</span></li>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Rewrite your main.py or equivalent to use <span class="Apple-style-span" style="background-color: white;">WSGI script handlers. That means it should look like this </span><span class="Apple-style-span" style="background-color: white;"><a href="http://code.google.com/appengine/docs/python/gettingstartedpython27/helloworld.html" style="color: #0658b5;" target="_blank">http://code.google.com/<wbr></wbr>appengine/docs/python/<wbr></wbr>gettingstartedpython27/<wbr></wbr>helloworld.html</a> and not like this </span><a href="http://code.google.com/appengine/docs/python/gettingstarted/usingwebapp.html">http://code.google.com/appengine/docs/python/gettingstarted/usingwebapp.html</a></span></li>
<li><span class="Apple-style-span" style="font-family: inherit;">Set up your App.yaml properly; change "runtime: python" to "runtime: python27" and add "threadsafe: true". Like this:<br /><span class="Apple-style-span" style="background-color: white;"><pre class="prettyprint lang-yaml" style="background-color: #fafafa; border-bottom-color: rgb(187, 187, 187); border-bottom-style: solid; border-bottom-width: 1px; border-left-color: rgb(187, 187, 187); border-left-style: solid; border-left-width: 1px; border-right-color: rgb(187, 187, 187); border-right-style: solid; border-right-width: 1px; border-top-color: rgb(187, 187, 187); border-top-style: solid; border-top-width: 1px; color: #007000; line-height: 15px; margin-top: 1em; overflow-x: auto; overflow-y: auto; padding-bottom: 0.99em; padding-left: 0.99em; padding-right: 0.99em; padding-top: 0.99em; word-wrap: break-word;"><span class="pln" style="color: black;">application</span><span class="pun" style="color: #666600;">:</span><span class="pln" style="color: black;"> helloworld
version</span><span class="pun" style="color: #666600;">:</span><span class="pln" style="color: black;"> </span><span class="lit" style="color: #006666;">1</span><span class="pln" style="color: black;">
runtime</span><span class="pun" style="color: #666600;">:</span><span class="pln" style="color: black;"> python27
api_version</span><span class="pun" style="color: #666600;">:</span><span class="pln" style="color: black;"> </span><span class="lit" style="color: #006666;">1</span><span class="pln" style="color: black;">
threadsafe</span><span class="pun" style="color: #666600;">:</span><span class="pln" style="color: black;"> </span><span class="kwd" style="color: #000088;">true</span><span class="pln" style="color: black;">
handlers</span><span class="pun" style="color: #666600;">:</span><span class="pln" style="color: black;">
</span><span class="pun" style="color: #666600;">-</span><span class="pln" style="color: black;"> url</span><span class="pun" style="color: #666600;">:</span><span class="pln" style="color: black;"> </span><span class="pun" style="color: #666600;">/.*</span><span class="pln" style="color: black;">
script</span><span class="pun" style="color: #666600;">:</span><span class="pln" style="color: black;"> helloworld</span><span class="pun" style="color: #666600;">.</span><span class="pln" style="color: black;">app</span></pre>
</span></span></li>
<li><span class="Apple-style-span" style="font-family: inherit;">Make sure to get the latest appengine sdk; v1.5.5 or later. You can't actually run with threadsafe:true in the dev appserver yet, but you need at least this version or it'll refuse to upload.</span></li>
</ul>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">So I can't run this stuff on the dev appserver?</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Nope. Just set "threadsafe: false" when running locally. That's a bit annoying, but I'm sure it'll be sorted out soon.</span></li>
</ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Damn, that list of stuff is tl;dr. Do I <b>have</b> to do this?</span></li>
<ul>
<li><span class="Apple-style-span" style="font-family: inherit;">Nope. In fact, it's early days and you'll be heading into experimental land if you do it. If it's totally weirding you out and you have better things to do with your life, just ignore this whole thing for a bit. Eventually, way later on, it'll become properly supported, and then probably compulsory, but by then there'll be better guides, better understanding in the community, all that. It's totally fair to let the crazy nerds race out and crash on the new features, then skate in past the fallen bodies like Steven Bradbury: </span></li>
</ul>
</ul>
<div>
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.youtube.com/embed/-DHgMiN6Nlc?feature=player_embedded' frameborder='0'></iframe></div>
<span class="Apple-style-span" style="font-family: arial, sans-serif; font-size: x-small;"><br /></span></div>
<b>Go Spiny Norman, Go!</b> (posted 2011-10-12, updated 2011-10-14)<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcvNBrlOFaYdPS6yK25B6NIeshr2dZLRUQd6uu5WzUkzXwOPqHXOrYcxsE2Ic4CjdLsBhJcC45jsQylr4m8ZlsrxwYAoTAWhaE-ENqLk7Vn1Kzq4FEIyuqMl-RVyLL4cmNVeJxnPOM2QI/s1600/spiny-norman-tick.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcvNBrlOFaYdPS6yK25B6NIeshr2dZLRUQd6uu5WzUkzXwOPqHXOrYcxsE2Ic4CjdLsBhJcC45jsQylr4m8ZlsrxwYAoTAWhaE-ENqLk7Vn1Kzq4FEIyuqMl-RVyLL4cmNVeJxnPOM2QI/s1600/spiny-norman-tick.jpg" /></a></div>
<br />
<i><b>Update 14 Oct 2011: </b>The green line from my graphs below has just appeared in the AppEngine admin console's Instances graph, as "Billing". Well actually it's the minimum of my green line and the blue "Total Instances" line, ie: it defines what I show as the pink area. Anyway, that's really, really useful, thanks to the AppEngine team from us developers!</i><br />
<i><b><br /></b></i><br />
<i><b>---</b></i><br />
<i><b><br /></b></i><br />
<i><b>tl;dr version:</b> You do *not* have to minimise your app's instance count in order to keep your appengine costs in an acceptable range. What you do have to do is to set Max Idle Instances to a low number. If you do that, your app can cost you about the same as it costs you now, possibly less. You don't need multithreading in order to achieve decent pricing. You will see advice like <span class="Apple-style-span" style="font-family: inherit;">this: "<a href="http://blorn.com/post/10013293300/the-unofficial-google-app-engine-price-change-faq">Forget about the scheduler. Turn on multithreading ASAP</a>".</span> That is wrong.</i><br />
<br />
----<br />
<br />
It looks like the <a href="http://appenginedevelopment.blogspot.com/2011/10/spiny-norman-test.html">Spiny Norman Test</a> was successful.<br />
<br />
Recall the hypothesis:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2553tJ-hT2059X3zHfbejxVrMR1OiFDkqjVQaY7W3iqwaicX1giFHJiLxV_nPRRETC_QuiynRHL-qw9x0nkVDO8uFWc3CV6eLoRNxE6ceySJ-S8E1NcBA6pgrN_iv_MGT2Ui3gx7l4Pk/s1600/billinghypothesis.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="109" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2553tJ-hT2059X3zHfbejxVrMR1OiFDkqjVQaY7W3iqwaicX1giFHJiLxV_nPRRETC_QuiynRHL-qw9x0nkVDO8uFWc3CV6eLoRNxE6ceySJ-S8E1NcBA6pgrN_iv_MGT2Ui3gx7l4Pk/s320/billinghypothesis.png" width="320" /></a></div>
<span class="Apple-style-span" style="background-color: white; color: #333333; font-family: 'Helvetica Neue Light', HelveticaNeue-Light, 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14px; line-height: 19px;"><br /></span><br />
<span class="Apple-style-span" style="background-color: white; color: #333333; font-family: 'Helvetica Neue Light', HelveticaNeue-Light, 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14px; line-height: 19px;">Hypothesis: Ignoring the 15 minute cost for spinning up new instances, the price we pay should be the pink area on the graph. That is, the moment by moment minimum of (total instances) and (active instances + Max Idle Instances). If Max Idle Instances is Automatic, then there is no green line, and we pay for the area under the blue line.</span><br />
<span class="Apple-style-span" style="background-color: white; color: #333333; font-family: 'Helvetica Neue Light', HelveticaNeue-Light, 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14px; line-height: 19px;"><br />I proposed </span><br />
<span class="Apple-style-span" style="background-color: white; color: #333333; font-family: 'Helvetica Neue Light', HelveticaNeue-Light, 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14px; line-height: 19px;">1 - First test that we pay for the area under the blue line when Max Idle Instances is Automatic.<br />2 - Next, test that we pay for the pink area when Max Idle Instances is set to something.</span><br />
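The hypothesis can be written down as a small calculation. This is only a sketch of the formula as stated, ignoring the 15 minute spin-up cost; billed_instance_hours and the sample numbers are made up for illustration and are not measurements from the tests.<br />

```python
def billed_instance_hours(samples, max_idle=None, step_hours=1.0):
    """Sum the hypothesised billable instance hours over a series of samples.

    samples:    list of (total_instances, active_instances), one per time step
    max_idle:   the Max Idle Instances setting, or None for Automatic
    step_hours: length of each time step in hours
    """
    billed = 0.0
    for total, active in samples:
        if max_idle is None:
            # Automatic: pay for the whole area under the blue line
            billed += total * step_hours
        else:
            # Otherwise: pay for the pink area, min(blue line, green line)
            billed += min(total, active + max_idle) * step_hours
    return billed


# Made-up hourly samples of (total instances, active instances).
samples = [(7, 1), (7, 3), (4, 4), (2, 0)]
automatic_hours = billed_instance_hours(samples)           # 7 + 7 + 4 + 2
capped_hours = billed_instance_hours(samples, max_idle=2)  # 3 + 5 + 4 + 2
```

With these numbers, Automatic bills 20 instance hours and Max Idle Instances = 2 bills 14, because the idle instances beyond the cap stop counting.<br />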
<br />
I ran two tests. Firstly, I ran <a href="http://appenginedevelopment.blogspot.com/2011/10/spiny-norman-test.html">Spiny Norman</a> with Max Idle Instances set to Automatic. Secondly, I ran Spiny Norman with Max Idle Instances set to 2.<br />
<br />
<b>Test 1: Max Idle Instances set to Automatic</b><br />
<br />
Here's the instance graph for the day:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxOAcIdomUUDAlntNSwkw0L0_9mNp4KRo63ST5QYCXQgc56PmrqFzfWurModqLy6F93TX_uFL9hYQazRL-7DXt1pR__vSJHexPqJABE8qu2ttnSBzGXQ0cszLLOzZHixMn3htQdWYgmFc/s1600/spiny-instances-10102011.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxOAcIdomUUDAlntNSwkw0L0_9mNp4KRo63ST5QYCXQgc56PmrqFzfWurModqLy6F93TX_uFL9hYQazRL-7DXt1pR__vSJHexPqJABE8qu2ttnSBzGXQ0cszLLOzZHixMn3htQdWYgmFc/s640/spiny-instances-10102011.png" width="640" /></a></div>
<br />
<br />
The area under the blue line looks to be roughly 7 * 20 = 140 hours. The billing says:<br />
<br />
<span class="Apple-style-span" style="background-color: #f6f9ff; font-family: Arial, sans-serif; font-size: 13px;"></span><br />
<table cellpadding="0" cellspacing="0" class="ae-billing-usage-report" style="border-bottom-width: 0px; border-collapse: collapse; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; empty-cells: show; font-size: 1em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; width: 531px;"><thead style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">
<tr style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"><th style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; background-position: initial initial; background-repeat: initial initial; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; color: #666666; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: left; vertical-align: bottom;">Resource</th><th class="ae-currency-th" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; background-position: initial initial; background-repeat: initial initial; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; color: #666666; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; vertical-align: bottom;">Used</th><th class="ae-currency-th" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; 
background-position: initial initial; background-repeat: initial initial; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; color: #666666; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; vertical-align: bottom;">Free</th><th class="ae-currency-th" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; background-position: initial initial; background-repeat: initial initial; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; color: #666666; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; vertical-align: bottom;">Billable</th><th class="ae-currency-th" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; background-position: initial initial; background-repeat: initial initial; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; color: #666666; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 
0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; vertical-align: bottom;">Charge</th></tr>
</thead><tbody style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">
<tr style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"><td style="background-color: transparent; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em;"><strong>Frontend Instance Hours</strong><br />
$0.04/Hour</td><td class="ae-currency" style="background-color: transparent; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; white-space: nowrap;">130.46</td><td class="ae-currency" style="background-color: transparent; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; white-space: nowrap;">28.00</td><td class="ae-currency" style="background-color: transparent; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; white-space: nowrap;">102.46</td><td class="ae-currency" style="background-color: transparent; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; 
margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; white-space: nowrap;">$4.10</td></tr>
</tbody></table>
<br />
Looks about right! So if you leave Max Idle Instances set to Automatic, you'll pay for the entire blue area.<br />
<br />
<b>Test 2: Max Idle Instances set to 2</b><br />
<br />
Here's the instance graph for the day of this test:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjN0AzFI8ykshhBmpb32MKZGZroTKJojBc5VKhSWgHChVu73jm87kl99qrHVzYsumtlshN9B3O9k_jLKnUQPULstZLC9BDL56flN5nu5niurbSuTrNoVAywN9iX-lNIt7ac3YoHiwvHyaY/s1600/spiny-instances-11102011.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="212" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjN0AzFI8ykshhBmpb32MKZGZroTKJojBc5VKhSWgHChVu73jm87kl99qrHVzYsumtlshN9B3O9k_jLKnUQPULstZLC9BDL56flN5nu5niurbSuTrNoVAywN9iX-lNIt7ac3YoHiwvHyaY/s640/spiny-instances-11102011.png" width="640" /></a></div>
<br />
That, believe it or not, is an identical run of Spiny Norman. Why the huge blowout of instances in the second part of the day? No idea; maybe the Max Idle Instances = 2 setting upset the scheduler. In any case, here we care about the yellow line (active instances), not the blue line. Here's a modified graph with the predicted "pink area" as above, for yellow line + 2:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbaaWYjJ77XidXecqA6LPB7hHD8HlhIc8Q_tTcQpRn31mwSRWWP6mq5_9OTyZ7K5Nx9feGrrwKowzwDjFbg3MJzDjm9SURYqHOFSLL2DIJoH-5X1pHTX4Skic9Bo1V2LaL1exsoBhkIVU/s1600/spiny-instances-with-pink-11102011.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbaaWYjJ77XidXecqA6LPB7hHD8HlhIc8Q_tTcQpRn31mwSRWWP6mq5_9OTyZ7K5Nx9feGrrwKowzwDjFbg3MJzDjm9SURYqHOFSLL2DIJoH-5X1pHTX4Skic9Bo1V2LaL1exsoBhkIVU/s640/spiny-instances-with-pink-11102011.png" width="640" /></a></div>
<br />
Spiny Norman uses negligible instance time (very low Active line); he just causes total (idle) instances to blow out. So that area looks to be around, say, (0 + 2) instances * 20 hours, or approximately 40 instance hours.<br />
<br />
And the billing says:<br />
<br />
<br />
<table cellpadding="0" cellspacing="0" class="ae-billing-usage-report" style="border-bottom-width: 0px; border-collapse: collapse; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; empty-cells: show; font-size: 1em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; width: 531px;"><thead style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">
<tr style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"><th style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; background-position: initial initial; background-repeat: initial initial; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; color: #666666; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: left; vertical-align: bottom;">Resource</th><th class="ae-currency-th" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; background-position: initial initial; background-repeat: initial initial; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; color: #666666; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; vertical-align: bottom;">Used</th><th class="ae-currency-th" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; 
background-position: initial initial; background-repeat: initial initial; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; color: #666666; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; vertical-align: bottom;">Free</th><th class="ae-currency-th" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; background-position: initial initial; background-repeat: initial initial; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; color: #666666; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; vertical-align: bottom;">Billable</th><th class="ae-currency-th" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; background-position: initial initial; background-repeat: initial initial; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; color: #666666; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 
0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; vertical-align: bottom;">Charge</th></tr>
</thead><tbody style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">
<tr style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"><td style="background-color: transparent; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em;"><strong>Frontend Instance Hours</strong><br />
$0.04/Hour</td><td class="ae-currency" style="background-color: transparent; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; white-space: nowrap;">44.90</td><td class="ae-currency" style="background-color: transparent; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; white-space: nowrap;">28.00</td><td class="ae-currency" style="background-color: transparent; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; white-space: nowrap;">16.90</td><td class="ae-currency" style="background-color: transparent; border-bottom-color: rgb(221, 221, 221); border-bottom-style: solid; border-bottom-width: 1px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-color: initial; border-top-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; 
margin-right: 0px; margin-top: 0px; padding-bottom: 0.4em; padding-left: 0px; padding-right: 0px; padding-top: 0.4em; text-align: right; white-space: nowrap;">$0.68</td></tr>
</tbody></table>
<br />
That's about right!<br />
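<br />
As a quick sanity check, here is a minimal sketch of the arithmetic behind the two billing tables above. It assumes the charge is simply (used hours minus the free quota) times the hourly rate, using the 28.00 free hours and $0.04/hour shown in the tables; the function name <code>frontend_charge</code> is made up for this illustration.<br />

```python
def frontend_charge(used_hours, free_hours=28.0, rate_per_hour=0.04):
    """Return (billable_hours, charge_in_dollars) for one day.

    Assumes billing is (used - free) * rate, as the tables above suggest.
    """
    billable = max(used_hours - free_hours, 0.0)
    return billable, billable * rate_per_hour


# "Used" figures taken from the two billing tables in this post.
for label, used in [("Automatic", 130.46), ("Max Idle Instances = 2", 44.90)]:
    billable, charge = frontend_charge(used)
    print("%s: %.2f billable hours, $%.2f" % (label, billable, charge))
```

Both results line up with the Billable and Charge columns, and the 44.90 hours used in Test 2 sits reasonably close to the eyeballed estimate of 40.<br />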
<br />
<b>Conclusion</b><br />
<br />
I could do some more in-depth tests, changing the behaviour of Spiny Norman over the course of a day, playing with the Max Idle Instances setting, but I think these two tests show the state of play pretty adequately. Hypothesis supported.<br />
<br />
So what this means, especially for those of us watching our pennies, is that even though we can't stop the scheduler kicking off massive numbers of instances, we can control whether we pay for them. Make sure that you set Max Idle Instances to a fixed (low!) number. For my self-funded projects I'll be setting it to 1, and that'll do.<br />
<br />
Leaving Max Idle Instances on Automatic, the default, is a mistake you'll regret very, very quickly.<br />
<br />
Of course the billing rules will probably change tomorrow. Ah well.Anonymoushttp://www.blogger.com/profile/11980745475562786998noreply@blogger.com0tag:blogger.com,1999:blog-4275632007165220704.post-52513928236468777782011-10-10T13:24:00.000+10:302011-10-12T16:00:18.159+10:30An Embarrassment of Riches<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWzBm1aQKFohMvSkOHIzdqOl3fDcj9Baqt9TcIfL-8SmxvF9FVtqcGChxTe8Fgn2kTVYAAJsjU4U5QEl9P4ZN01mq_CGZEYjtraIVcnjWYR72TUPdKd58qrHiA_NRmsldEtj6BILYNHi0/s1600/riches.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWzBm1aQKFohMvSkOHIzdqOl3fDcj9Baqt9TcIfL-8SmxvF9FVtqcGChxTe8Fgn2kTVYAAJsjU4U5QEl9P4ZN01mq_CGZEYjtraIVcnjWYR72TUPdKd58qrHiA_NRmsldEtj6BILYNHi0/s320/riches.png" width="320" /></a></div>
<br />
<i>Update: Details on new stuff in Python 2.7 for appengine here: </i><a href="http://code.google.com/appengine/docs/python/python27/newin27.html">http://code.google.com/appengine/docs/python/python27/newin27.html</a><br />
<br />
<i>AppEngine is all colour and movement at the moment.</i><br />
<i><br /></i><br />
<ul>
<li><i>MySQL compatible db layer, Google Cloud SQL. </i></li>
<li><i>Python 2.7 now available as an experimental runtime for all apps. </i></li>
<li><i>Cross group (XG) transactions. </i></li>
<li><i>Increased limits for all kinds of cool stuff.</i></li>
</ul>
<i><br /></i><br />
<i>First, Cloud SQL is released. Yo Dawg, we heard you like databases, so we put a database in your database so you can query while you query. Or, more clearly, this:</i><br />
<br />
<br />
<h2 class="date-header" style="color: #666666; font: normal normal normal 95%/normal Arial, sans-serif; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">
Thursday, October 6, 2011</h2>
<div class="date-posts">
<div class="post-outer">
<div class="post hentry" style="border-bottom-color: rgb(204, 204, 204); border-bottom-style: dotted; border-bottom-width: 1px; margin-bottom: 1.5em; margin-left: 0px; margin-right: 0px; margin-top: 0.5em; padding-bottom: 1.5em; padding-right: 1.5em;">
<h3 class="post-title entry-title" style="color: #666666; font-size: 18px; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0.25em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">
<a href="http://googleappengine.blogspot.com/2011/10/google-cloud-sql-your-database-in-cloud.html" style="color: #666666; display: block; text-decoration: none;">Google Cloud SQL: Your database in the cloud</a></h3>
<div class="post-header-line-1">
</div>
<div class="post-body entry-content">
<em>Cross-posted from the <a href="http://googlecode.blogspot.com/2011/10/google-cloud-sql-your-database-in-cloud.html">Google Code Blog</a></em><br />
<br />
One of App Engine’s most requested features has been a simple way to develop traditional database-driven applications. In response to your feedback, we’re happy to announce the limited preview of <a href="http://code.google.com/apis/sql/">Google Cloud SQL</a>. You can now choose to power your App Engine applications with a familiar relational database in a fully-managed cloud environment. This allows you to focus on developing your applications and services, free from the chores of managing, maintaining and administering relational databases. Google Cloud SQL brings many benefits to the App Engine community:<br />
<ul>
<li>No maintenance or administration - we manage the database for you.</li>
<li>High reliability and availability - your data is replicated synchronously to multiple data centers. Machine, rack and data center failures are handled automatically to minimize end-user impact.</li>
<li>Familiar <a href="http://en.wikipedia.org/wiki/MySQL">MySQL</a> database environment with <a href="http://en.wikipedia.org/wiki/Java_Database_Connectivity">JDBC</a> support (for Java-based App Engine applications) and <a href="http://wiki.python.org/moin/DatabaseProgramming/">DB-API</a> support (for Python-based App Engine applications).</li>
<li>Comprehensive user interface for administering databases.</li>
<li>Simple and powerful integration with <a href="http://code.google.com/appengine/">Google App Engine</a>.</li>
</ul>
The service includes database import and export functionality, so you can move your existing MySQL databases to the cloud and use them with App Engine. Cloud SQL is available free of charge for now, and we will publish pricing at least 30 days before charging for it. The service will continue to evolve as we work out the kinks during the preview, but <a href="http://goo.gl/jVq1m">let us know</a> if you’d like to take it for a spin. </div>
</div>
</div>
</div>
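<br />
For anyone who hasn't used DB-API from Python, here is a rough sketch of the general pattern the announcement refers to: connect, get a cursor, execute, fetch. This is not Cloud SQL code. The standard library's <code>sqlite3</code> module stands in for the real driver (which the announcement doesn't show), and the <code>guestbook</code> table and its columns are invented for the example.<br />

```python
import sqlite3

# sqlite3 is a stand-in here; a Cloud SQL driver would supply its own connect().
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table, purely for illustration.
cur.execute(
    "CREATE TABLE guestbook (id INTEGER PRIMARY KEY, author TEXT, message TEXT)"
)
# Parameterised insert; sqlite3 uses '?' placeholders, other drivers may differ.
cur.execute(
    "INSERT INTO guestbook (author, message) VALUES (?, ?)",
    ("alice", "hello"),
)
conn.commit()

cur.execute("SELECT author, message FROM guestbook ORDER BY id")
rows = cur.fetchall()
conn.close()

print(rows)
```

The shape of the code (connection, cursor, execute, fetch) is what DB-API standardises; the connection call and placeholder style are the parts most likely to differ between drivers.<br />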
<i><br /></i><br />
<i>Here's the FAQ: </i><a href="http://code.google.com/apis/sql/faq.html">http://code.google.com/apis/sql/faq.html</a><br />
<i>And the Group: </i><a href="https://groups.google.com/forum/#!forum/google-cloud-sql-discuss">https://groups.google.com/forum/#!forum/google-cloud-sql-discuss</a><i> </i><br />
<i><br /></i><br />
<i>The signup form implied that we'll be able to mix and match SQL and datastore, which is damned fine.</i><br />
<i><br /></i><br />
<i>Then, the prerelease of the new SDK is announced by Ikai Lan:</i><br />
<br />
<span class="Apple-style-span" style="background-color: white; font-family: arial, sans-serif; font-size: 13px;">Hey everyone,</span><br />
<div>
<br /></div>
<div>
Prerelease SDK 1.5.5 is now available for download! You can get it here:</div>
<div>
<br /></div>
<div>
Python:</div>
<div>
<a href="http://code.google.com/p/googleappengine/downloads/detail?name=google_appengine_prerelease-1.5.5.zip" style="color: #0658b5;" target="_blank">http://code.google.com/p/<wbr></wbr>googleappengine/downloads/<wbr></wbr>detail?name=google_appengine_<wbr></wbr>prerelease-1.5.5.zip</a></div>
<div>
<br /></div>
<div>
Java:</div>
<div>
<a href="http://code.google.com/p/googleappengine/downloads/detail?name=appengine-java-sdk-prerelease-1.5.5.zip" style="color: #0658b5;" target="_blank">http://code.google.com/p/<wbr></wbr>googleappengine/downloads/<wbr></wbr>detail?name=appengine-java-<wbr></wbr>sdk-prerelease-1.5.5.zip</a></div>
<div>
<br /></div>
<div>
We provide prerelease SDKs as previews for things to come. New features should not work in production yet, and documentation is typically still a work in progress. Release notes are below as well as in the prerelease packages:</div>
<div>
<br /></div>
<div>
<div>
Python</div>
<div>
==============================<wbr></wbr>=</div>
<div>
- Python 2.7 is now available as an experimental runtime for all applications</div>
<div>
using the High Replication Datastore. To upload your app to the Python 2.7</div>
<div>
runtime, change the runtime argument in your app.yaml to python27.</div>
<div>
- We have released an experimental utility, available in the Admin Console, to</div>
<div>
assist in migrating your application to the High Replication datastore. This</div>
<div>
utility allows you to copy the bulk of your data in the background, while the</div>
<div>
source application is still serving. You then need a brief read-only period to</div>
<div>
migrate your application data while you copy the data that has changed from</div>
<div>
the time the original copy started.</div>
<div>
- We have increased the number of files you can upload with your application</div>
<div>
from 3,000 to 10,000.</div>
<div>
- We have increased the size limit for a single file uploaded to App Engine from</div>
<div>
10MB to 32MB.</div>
<div>
- We have increased the Frontend request deadline from 30 seconds to 60 seconds.</div>
<div>
- We have increased the URLFetch maximum deadline from 10 seconds to 60 seconds.</div>
<div>
- We have increased the URLFetch Post payload from 1MB to 5MB.</div>
<div>
- App Engine now supports Cross Group (XG) transactions with the High</div>
<div>
Replication Datastore, which allow you to perform transactions across</div>
<div>
multiple entity groups.</div>
<div>
- We have released an experimental API that can write to Google Storage for</div>
<div>
Developers directly from App Engine.</div>
<div>
- We have added a graph to the admin console that displays the number of</div>
<div>
instances for which you will be billed.</div>
<div>
- In the XMPP API, get_presence() is deprecated in favor of using the inbound</div>
<div>
presence handlers documented in</div>
<div>
<a href="http://code.google.com/appengine/docs/python/xmpp/overview.html#Handling_User_Presence" style="color: #0658b5;" target="_blank">http://code.google.com/<wbr></wbr>appengine/docs/python/xmpp/<wbr></wbr>overview.html#Handling_User_<wbr></wbr>Presence</a>.</div>
<div>
- The Task Queue API 'target' parameter now accepts a new value,</div>
<div>
taskqueue.DEFAULT_APP_VERSION, which will send the task to the default</div>
<div>
frontend version, rather than the version or backend where the 'add' method is</div>
<div>
being called.</div>
<div>
- In the URLFetch API, make_fetch_call() now returns an RPC object.</div>
<div>
- Fixed an issue in the Admin Console where the "Run Now" button did not work</div>
<div>
for tasks with a '-' in the name.</div>
<div>
- Fixed an issue where the SDK did not decode Base64 encoded blobs.</div>
<div>
- Fixed an issue to provide a better error message when using the Mail API to</div>
<div>
send email to an invalid user address.</div>
<div>
- Fixed an issue in the SDK where a skip_files entry caused an ImportError when</div>
<div>
the library was located elsewhere in the PYTHONPATH.</div>
<div>
- Fixed an issue in the SDK index viewer where the arrows indicating whether a</div>
<div>
query was ascending or descending were not properly rendered.</div>
<div>
- Fixed an issue where httplib did not support the deadline argument for</div>
<div>
URLFetch calls.</div>
<div>
<a href="http://code.google.com/p/googleappengine/issues/detail?id=2216" style="color: #0658b5;" target="_blank">http://code.google.com/p/<wbr></wbr>googleappengine/issues/detail?<wbr></wbr>id=2216</a></div>
<div>
- Fixed an issue where you could not schedule a cron job to run every 100</div>
<div>
minutes.</div>
<div>
<a href="http://code.google.com/p/googleappengine/issues/detail?id=5243" style="color: #0658b5;" target="_blank">http://code.google.com/p/<wbr></wbr>googleappengine/issues/detail?<wbr></wbr>id=5243</a></div>
<div>
- Fixed an issue in the SDK where failed tasks retried immediately instead of</div>
<div>
waiting for 30 seconds.</div>
<div>
<a href="http://code.google.com/p/googleappengine/issues/detail?id=5587" style="color: #0658b5;" target="_blank">http://code.google.com/p/<wbr></wbr>googleappengine/issues/detail?<wbr></wbr>id=5587</a></div>
<div>
- Fixed an issue making it possible to modify request headers using the deferred</div>
<div>
library.</div>
<div>
<a href="http://code.google.com/p/googleappengine/issues/detail?id=5861" style="color: #0658b5;" target="_blank">http://code.google.com/p/<wbr></wbr>googleappengine/issues/detail?<wbr></wbr>id=5861</a></div>
</div>
<div>
<br /></div>
<div>
Java</div>
<div>
<div>
=============</div>
<div>
- We have released an experimental utility, available in the Admin Console, to</div>
<div>
assist in migrating your application to the High Replication datastore. This</div>
<div>
utility allows you to copy the bulk of your data in the background, while the</div>
<div>
source application is still serving. You then need to take a short downtime to</div>
<div>
migrate your application data while you copy the data that has changed from</div>
<div>
the time the original copy started.</div>
<div>
- We have increased the number of files you can upload with your application</div>
<div>
from 3,000 to 10,000.</div>
<div>
- We have increased the size limit for a single file uploaded to App Engine from</div>
<div>
10MB to 32MB.</div>
<div>
- We have increased the Frontend request deadline from 30 seconds to 60 seconds.</div>
<div>
- We have increased the URLFetch maximum deadline from 10 seconds to 60 seconds.</div>
<div>
- We have increased the URLFetch Post payload from 1MB to 5MB.</div>
<div>
- App Engine now supports Cross Group (XG) transactions with the High</div>
<div>
Replication Datastore, which allow you to perform transactions across multiple</div>
<div>
entity groups.</div>
<div>
- We have released an experimental API that can write to Google Storage for</div>
<div>
Developers directly from App Engine.</div>
<div>
- We have added a graph to the admin console that displays the number of</div>
<div>
instances for which you will be billed.</div>
<div>
- In the XMPP API, getPresence() is deprecated in favor of using the inbound</div>
<div>
presence handlers documented in</div>
<div>
<a href="http://code.google.com/appengine/docs/java/xmpp/overview.html#Handling_User_Presence" style="color: #0658b5;" target="_blank">http://code.google.com/<wbr></wbr>appengine/docs/java/xmpp/<wbr></wbr>overview.html#Handling_User_<wbr></wbr>Presence</a>.</div>
<div>
- Fixed an issue in the Admin Console where the "Run Now" button did not work</div>
<div>
for tasks with a '-' in the name.</div>
<div>
- Fixed an issue to provide a better error message when a user tries to parse an</div>
<div>
HttpRequest's input stream more than once in a request.</div>
<div>
- Fixed an issue to provide a better error message when using the Mail API to</div>
<div>
send email to an invalid user address.</div>
<div>
- Fixed an issue in the SDK where HttpServletRequest.<wbr></wbr>getInputStream().read()</div>
<div>
always returned -1.</div>
<div>
<a href="http://code.google.com/p/googleappengine/issues/detail?id=5396" style="color: #0658b5;" target="_blank">http://code.google.com/p/<wbr></wbr>googleappengine/issues/detail?<wbr></wbr>id=5396</a></div>
<div>
- Fixed an issue where you could not schedule a cron job to run every 100</div>
<div>
minutes.</div>
<div>
<a href="http://code.google.com/p/googleappengine/issues/detail?id=5861" style="color: #0658b5;" target="_blank">http://code.google.com/p/<wbr></wbr>googleappengine/issues/detail?<wbr></wbr>id=5861</a></div>
</div>
<div>
<br /></div>
<div>
<br />
<div>
<br />
--</div>
<div>
Ikai Lan<br />
Developer Programs Engineer, Google App Engine</div>
<div>
<a href="http://plus.ikailan.com/" style="color: #0658b5;" target="_blank">plus.ikailan.com</a> | <a href="http://twitter.com/ikai" style="color: #0658b5;" target="_blank">twitter.com/ikai</a></div>
</div>
<br />
<br />
<br />Anonymoushttp://www.blogger.com/profile/11980745475562786998noreply@blogger.com0tag:blogger.com,1999:blog-4275632007165220704.post-7692978045787491012011-10-10T00:08:00.001+10:302011-10-12T23:28:03.266+10:30The Spiny Norman Test<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlcMciUUGFT9VrrV8wRJexzXfwMoqPmz5jTkoCg4S52yJznPBGC3XMKwfHil4Vt27bhnVXMPVdp5IZkwz3_mUXxyxfP1eZ9UTcf-lsq7HIKW-NLKtw0tlhggMLmCe-Uhv9easyTs7sM8c/s1600/spiny-norman-59306.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlcMciUUGFT9VrrV8wRJexzXfwMoqPmz5jTkoCg4S52yJznPBGC3XMKwfHil4Vt27bhnVXMPVdp5IZkwz3_mUXxyxfP1eZ9UTcf-lsq7HIKW-NLKtw0tlhggMLmCe-Uhv9easyTs7sM8c/s1600/spiny-norman-59306.jpg" /></a></div>
<br />
<i>Update: Results are in, see <a href="http://appenginedevelopment.blogspot.com/2011/10/go-spiny-norman-go.html">Go Spiny Norman, Go</a>.</i><br />
<br />
Previously, in <a href="http://appenginedevelopment.blogspot.com/2011/09/amazing-story-of-appengine-and-two.html">The Amazing Story of AppEngine and the Two Orders Of Magnitude</a>, I wrote about minimizing the cost of instances in the new AppEngine billing regime. But I think I made a mistake, and I think many people are making the same mistake.<br />
<br />
Here's one of the graphs that I showed of instance usage from my appengine app <a href="http://my.syyn.cc/">Syyncc</a>:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://point7.files.wordpress.com/2011/09/gaeinstancesgraph7sep2011.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="109" src="http://point7.files.wordpress.com/2011/09/gaeinstancesgraph7sep2011.png" width="320" /></a></div>
<br />
My posts were largely about trying to drop the blue line down (that's "Total" instances), and I largely ignored the yellow line, "Active" instances.<br />
<br />
To get that blue line down, I did two things. First, I set Max Idle Instances to 1, from Automatic. That is detailed <a href="https://point7.wordpress.com/2011/09/04/appengine-tuning-1/">here</a>, and it succeeded in dropping the blue line. Next, I changed my app's task behaviour: instead of kicking off 50 tasks every 2 minutes, I smoothed them out, scheduling one every two seconds.<br />
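<br />
To make the smoothing idea concrete, here is a small sketch of one way to spread a batch of tasks evenly over a window instead of firing them all at once. It is not necessarily how Syyncc does it; the helper <code>smoothed_countdowns</code> is hypothetical, and it only computes delay values (in seconds) that could then be handed to whatever scheduling mechanism you use, such as a per-task countdown.<br />

```python
def smoothed_countdowns(task_count, window_seconds):
    """Return one delay (in seconds) per task, evenly spaced across the window.

    Hypothetical helper: it only computes the delays, it does not enqueue anything.
    """
    spacing = float(window_seconds) / task_count
    return [i * spacing for i in range(task_count)]


# 50 tasks spread over a 2 minute window, instead of all at time zero.
delays = smoothed_countdowns(50, 120)
print(delays[:3])
print(delays[-1])
```

With 50 tasks in a 120 second window the spacing works out to 2.4 seconds, which is in the same ballpark as the "one every two seconds" described above.<br />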
<br />
<a href="https://point7.wordpress.com/2011/09/10/appengine-tuning-schlemiel-youre-fired/">Once I got my billing results</a>, it was clear that these changes had made a huge impact. But the numbers were puzzling. Firstly, they were too low (which I just accepted happily, as these numbers represent money in my pocket). Secondly, all of the benefit appeared to come from the first change (Max Idle Instances), with no change from the smoothing out of tasks. That's been bugging me.<br />
<br />
And then on the AppEngine list, Gerald Tan made this comment:<br />
<span class="Apple-style-span" style="background-color: white; font-family: arial, sans-serif; font-size: 13px;"><br /></span><br />
<span class="Apple-style-span" style="background-color: white; font-family: arial, sans-serif; font-size: 13px;">The reason why your Frontend Instance hours are lower than you expected is because you assumed that you will be billed for the area under the BLUE line in the Instance graph. It's not. You are being billed for the area under the YELLOW line (Active Instance) PLUS your Max Idle Instance setting. So your Active Instances is hovering at around ~0.72, and I assume you have set your application's Max Idle Instance to 1. Therefore ~1.72 * 24 = ~41.28 Instance Hours</span><br />
<br />
Oh really?? That would match the data, very cool. And why were the numbers so high before I set the Max Idle Instances to 1?<br />
<br />
This <a href="http://code.google.com/appengine/kb/postpreviewpricing.html">Post-Preview Pricing FAQ</a> (should have been called a Primer for the alliteration) says some unclear things. We have this:<br />
<br />
"<span class="Apple-style-span" style="background-color: white; font-family: Helvetica, Arial, sans-serif; font-size: x-small; line-height: 16px;">Instances are charged for their uptime in addition to a 15-minute startup fee, the startup fee covers what it takes for App Engine to bring up and down the instance. So, if you have an on-demand instance only serving traffic for 5 minutes, you will pay for 5+15 minutes, or $0.08 / 60 * 20 = 2.6 cents. Additionally, if the instance stops and then starts again within a 15 minute window, the startup fee will only be charged once and the instance will be considered "up" for the time that passed. For example, if an on-demand instance is serving traffic for 5 min, is then down for 4 minutes and then serving traffic for 3 more minutes, you will pay for (5+4+3)+15 minutes, or $0.08 / 60 * 27 = 3.6 cents."</span><br />
<span class="Apple-style-span" style="font-family: Helvetica, Arial, sans-serif; font-size: x-small;"><span class="Apple-style-span" style="line-height: 16px;"><br /></span></span><br />
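The arithmetic in that FAQ passage is easy to check. Here's a rough sketch in plain Python; the helper name and constants are mine, taken from the numbers quoted above, and are not part of any App Engine API.<br />

```python
# Sketch of the FAQ's on-demand pricing arithmetic.
# on_demand_cost_cents is a made-up helper, not an App Engine API.
HOURLY_RATE_DOLLARS = 0.08   # on-demand rate quoted in the FAQ
STARTUP_FEE_MINUTES = 15     # flat startup fee quoted in the FAQ

def on_demand_cost_cents(uptime_minutes):
    """Cost in cents for one stretch of uptime plus one startup fee."""
    total_minutes = uptime_minutes + STARTUP_FEE_MINUTES
    return HOURLY_RATE_DOLLARS / 60.0 * total_minutes * 100.0

# 5 minutes of traffic: billed for 5 + 15 = 20 minutes.
single_burst = on_demand_cost_cents(5)

# 5 up, 4 down, 3 up: the gap is under 15 minutes, so it counts as uptime
# and the startup fee is charged once: (5 + 4 + 3) + 15 = 27 minutes.
stop_start = on_demand_cost_cents(5 + 4 + 3)

print("%.2f cents, %.2f cents" % (single_burst, stop_start))
```

The FAQ's "2.6 cents" looks like a truncation of roughly 2.67.<br />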
On the other hand, this:<br />
<br />
<span class="Apple-style-span" style="background-color: white; font-family: Helvetica, Arial, sans-serif; font-size: x-small; line-height: 16px;"><b>Max Idle Instances</b>: Decreasing this value will likely decrease your bill as fewer idle instances will typically be running and we will not charge for any excessive idle instances. In this case the scheduler knob is a suggestion to the scheduler but we will not charge you for excess if the scheduler ignores the suggestion. For instance, if you set Max Idle Instances to 5 and the scheduler leaves 16 instances up for some length of time, you will only be charged for 5 instances.</span><br />
<span class="Apple-style-span" style="background-color: white; font-family: Helvetica, Arial, sans-serif; font-size: x-small; line-height: 16px;"><br /></span><br />
So, I think this might mean the following:<br />
<br />
If you set "Max Idle Instances" to Automatic (the default setting), that means you are letting the scheduler spend your money. It'll keep as many instances running at any time as it thinks you need, and you'll pay for all of them (plus that nasty 15 minute bonus on starting up extras). This means, you pay for the area under the blue line.<br />
<br />
If you set "Max Idle Instances" to a specific value, you'll pay for your active instance time plus your "Max Idle Instances" setting, or your Total instance time, whichever is less. ie: you pay for the minimum of (area under yellow line + Max Idle Instances) and (area under blue line).<br />
<br />
So setting Max Idle Instances to an actual number is a good idea. The lower you set it, the more it might affect the scheduler's decisions, but still, to minimise cost, set it to a finite number.<br />
<br />
Great conjectures. But then, the old lady in my head (oh god she's really in there) says this:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjt3FZJ1cPvvlOFu6P3-AMgEzGlrfjK2r2dv92lHCgXGhnfvY388J8UJ8XDjISy9JMUmQWK-we1QizQWRHlOKyWgnprUJX-53N-44ck8vfwk2cbU3SoXvdOcKQK1e4q_YwpM_LMyFL2YVs/s1600/testit.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="202" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjt3FZJ1cPvvlOFu6P3-AMgEzGlrfjK2r2dv92lHCgXGhnfvY388J8UJ8XDjISy9JMUmQWK-we1QizQWRHlOKyWgnprUJX-53N-44ck8vfwk2cbU3SoXvdOcKQK1e4q_YwpM_LMyFL2YVs/s320/testit.png" width="320" /></a></div>
<br />
<div style="text-align: center;">
<span class="Apple-style-span" style="font-family: Verdana, sans-serif; font-size: x-large;">TEST IT!</span></div>
<br />
Ok old lady, I'll test it. razza frazza rackkin testin frazza razza....<br />
<br />
---<br />
<br />
Ok, so first we need a hypothesis. Put a new line on the graph, a green line, which is the yellow line, raised up by the setting of Max Idle Instances. If Max Idle Instances is 3, it'll look like this:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4VMS7W-N1e7mnSWJWEoXA9aEDDiRj3z1Hypy-G5wKE5Vv8-jCjCfrA54I5Tu_dOJv-YZufUsS7-WM7sFUnyBcGi7-ZYr_rA3Wygav-XOnPTAI1kNiClFN2L9XhwIVc6kXXe8XQR7VPpY/s1600/billinghypothesis.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="109" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4VMS7W-N1e7mnSWJWEoXA9aEDDiRj3z1Hypy-G5wKE5Vv8-jCjCfrA54I5Tu_dOJv-YZufUsS7-WM7sFUnyBcGi7-ZYr_rA3Wygav-XOnPTAI1kNiClFN2L9XhwIVc6kXXe8XQR7VPpY/s320/billinghypothesis.png" width="320" /></a></div>
<br />
The pink area is the intersection of the area under the blue line and the area under the green line.<br />
<br />
Hypothesis: Ignoring the 15 minute cost for spinning up new instances, the price we pay should be the pink area on the graph. That is, the moment by moment minimum of (total instances) and (active instances + Max Idle Instances). If Max Idle Instances is Automatic, then there is no green line, and we pay for the area under the blue line.<br />
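<br />
To make the hypothesis concrete, here's a minimal sketch in plain Python. The function name and the sample numbers are invented for illustration; only the min() rule and Gerald's ~0.72 figure come from the discussion above, and the 15 minute startup fee is ignored.<br />

```python
def billed_instance_hours(total, active, max_idle, sample_hours):
    """Hypothesised bill: per-sample min(total, active + max_idle), summed over time.

    total and active are equal-length lists of samples (blue and yellow lines).
    max_idle=None stands for the Automatic setting (pay for the blue line).
    """
    hours = 0.0
    for t, a in zip(total, active):
        billable = t if max_idle is None else min(t, a + max_idle)
        hours += billable * sample_hours
    return hours

# Made-up samples, one per hour: total instances and active instances.
total_samples = [4, 4, 2, 6]
active_samples = [0.5, 1.0, 0.5, 2.0]

automatic = billed_instance_hours(total_samples, active_samples, None, 1.0)
capped = billed_instance_hours(total_samples, active_samples, 1, 1.0)

# Gerald's example: active hovering at ~0.72 all day, Max Idle Instances = 1.
# The total of 3 instances is an assumption; any value above 1.72 gives the same bill.
gerald = billed_instance_hours([3] * 24, [0.72] * 24, 1, 1.0)
```

With these invented samples the Automatic case bills the whole blue area, while the capped case bills only the pink area.<br />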
<br />
So how do we test that hypothesis?<br />
<br />
1 - First, test that we pay for the area under the blue line when Max Idle Instances is Automatic.<br />
2 - Next, test that we pay for the pink area when Max Idle Instances is set to a specific value.<br />
<br />
To get a good test here, we want to create an instance usage profile where the blue line and the yellow line are disparate. My best guess for how to do this is to create some spiky usage, that should leave too many instances running most of the time.<br />
<br />
<b>Enter Spiny Norman!</b><br />
<br />
Spiny Norman is a <a href="http://appenginedevelopment.blogspot.com/2011/10/worker.html">Worker</a> class, designed to do one thing: cause AppEngine to experience very bursty load.<br />
<br />
<pre>import logging
from Worker import Worker
from datetime import timedelta
from google.appengine.ext import db

class SpinyNorman(Worker):
    _minutesBetweenSpines = 12  # gap between spines
    _spineWidth = 1000000  # busy loop iterations per task
    _numberOfSpinesRemaining = db.IntegerProperty()

    @classmethod
    def CreateSpines(cls, aSpineLength, aNumberOfSpines):
        # create one worker per task in a spine
        lcount = 0
        while lcount < aSpineLength:
            lnorman = SpinyNorman()
            lnorman._numberOfSpinesRemaining = aNumberOfSpines
            lnorman.enabled = True
            lnorman.put()
            lcount += 1

    def doExecute(self):
        self._numberOfSpinesRemaining -= 1
        # burn some CPU in a busy loop
        lcount = 0
        while lcount < self._spineWidth:
            lcount += 1
        logging.debug(lcount)

    def doCalculateNextRun(self, aUtcNow, alastDue):
        if self._numberOfSpinesRemaining > 0:
            if alastDue:
                return alastDue + timedelta(minutes=self._minutesBetweenSpines)
            else:
                return aUtcNow + timedelta(minutes=self._minutesBetweenSpines)
        else:
            return None # time to stop
</pre>
Spiny Norman creates a spiny workload, as follows:<br />
<br />
Each "Spine" is a set of tasks running (doExecute()) at the same time. The length of the spine is the number of tasks. The width of the spine (in time) is a measure of how much work the spine will do (how long it'll work for). Spines are set apart from each other in time, which is the minutes between spines. There are a fixed number of spines.<br />
<br />
You kick off Spiny Norman by calling SpinyNorman.CreateSpines(spineLength, numberOfSpines). That creates a number of instances of Spiny Norman equal to the spineLength, and sets the countdown for how many iterations they should continue for (numberOfSpines). _spineWidth is the number of times to sit in a busy loop in doExecute. _minutesBetweenSpines is used to calculate the next run time in doCalculateNextRun.<br />
<br />
I'm using a spine length of 250 (that is, 250 tasks), a spine width of 1,000,000 (enough load to notice some work being done, a few seconds' worth), 12 minutes between spines and 100 spines total (ie Spiny Norman runs for about 1200 minutes, or 20 hours, total).<br />
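<br />
For a feel of the load profile those settings should produce, here's a back-of-envelope sketch in plain Python. It assumes the first spine fires as soon as the workers are created; the start date is arbitrary and the helper name is made up.<br />

```python
from datetime import datetime, timedelta

SPINE_LENGTH = 250            # tasks per spine
NUMBER_OF_SPINES = 100
MINUTES_BETWEEN_SPINES = 12

def spine_times(start, number_of_spines, minutes_between):
    """Firing time of each spine, assuming the first one fires at `start`."""
    step = timedelta(minutes=minutes_between)
    return [start + i * step for i in range(number_of_spines)]

# Arbitrary start time, used only to illustrate the spacing.
times = spine_times(datetime(2011, 10, 1, 0, 0), NUMBER_OF_SPINES, MINUTES_BETWEEN_SPINES)

total_task_runs = SPINE_LENGTH * NUMBER_OF_SPINES
span_hours = (times[-1] - times[0]).total_seconds() / 3600.0
```

Under that assumption the run is 25,000 task executions spread over just under 20 hours, with nothing at all happening between spines.<br />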
<br />
I've set up a new AppEngine app, I've enabled billing, and I've kicked off Spiny Norman to run during his own billing day. I've left Max Idle Instances set to Automatic. We should see a huge difference between the blue and yellow instance lines, and the billing should tell us which one I'm paying for, which will test part 1 of the hypothesis.<br />
<br />
In the next post I'll publish the result of this test, and I'll kick off the next test. Stay tuned!<br />
<br />
<b>The Worker</b> (published 2011-10-01T17:54:00.000+09:30, updated 2013-01-04T13:46:46.838+10:30)<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXXwKaxFFazIznVoGKTC5rr20pKz1C1BhHOAnX01miOQBc68HDx0UjhRsgz3Ua9ufCYtUbo_EUzg8M4hQA6l4RR9l1-mOU-z84sY5xca7c_Y_d2W00Vfvno0yxyef1gZT0SXOmvzfHvAM/s1600/peasant01.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXXwKaxFFazIznVoGKTC5rr20pKz1C1BhHOAnX01miOQBc68HDx0UjhRsgz3Ua9ufCYtUbo_EUzg8M4hQA6l4RR9l1-mOU-z84sY5xca7c_Y_d2W00Vfvno0yxyef1gZT0SXOmvzfHvAM/s1600/peasant01.jpg" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">More Work?</td></tr>
</tbody></table>
<br />
One of the first serious <a href="http://appengine.google.com/">Google AppEngine</a> subjects I've approached recently is the problem of doing work in the background. In my particular case I needed to do some intensive and error prone tasks, then send an email with the results (which is also error prone), on a schedule.<br />
<br />
I was going to write some standard job-processing-in-a-loop kind of code, with the loop being processed as a cron job (set up in cron.yaml). That's what <a href="http://my.syyn.cc/">Syyncc</a> does. But some bit of my brain kept grumbling about the inelegance of that approach. You're on a platform that wants to do it a different way, says my brain (and who am I to disagree?).<br />
<br />
And the cron thing is kind of bad, because it doesn't scale. Let's say I schedule a job every two minutes. It can get through some fixed amount of work (maybe 10 jobs?) before it hits its time limit. It can never do more than that. That's nasty.<br />
<br />
People often recommend <a href="http://code.google.com/appengine/docs/python/backends/overview.html">backends</a> for this kind of work. With them, you stick jobs on a pull queue, and pull them off with the backend. Each backend can process a limited number of jobs, but you can set them to be automatically created in response to workload, which is cool.<br />
<br />
But I'm partial to <a href="http://code.google.com/appengine/docs/python/taskqueue/overview-push.html">push queues</a>, what were previously just called Task Queues. At any point in code you can schedule a task to run, which simply comes through as a post to a url in your app:<br />
<span class="Apple-style-span" style="background-color: #fafafa; color: #007000; font-family: monospace; font-size: 12px; line-height: 15px; white-space: pre;"><span class="pln" style="color: black;"><br /></span></span>
<span class="Apple-style-span" style="background-color: #fafafa; color: #007000; font-family: monospace; font-size: 12px; line-height: 15px; white-space: pre;"><span class="pln" style="color: black;"> taskqueue</span><span class="pun" style="color: #666600;">.</span><span class="pln" style="color: black;">add</span><span class="pun" style="color: #666600;">(</span><span class="pln" style="color: black;">url</span><span class="pun" style="color: #666600;">=</span><span class="str" style="color: #008800;">'/dosomething'</span><span class="pun" style="color: #666600;">,</span><span class="pln" style="color: black;"> </span><span class="kwd" style="color: #000088;">params</span><span class="pun" style="color: #666600;">={</span><span class="str" style="color: #008800;">'key'</span><span class="pun" style="color: #666600;">:</span><span class="pln" style="color: black;"> key</span><span class="pun" style="color: #666600;">})</span></span><br />
<br />
It's a bit clunky, because you need to set up a handler for the url, and implement the Post method.<br />
<br />
Oh wait, no you don't. Nick Johnson wrote the excellent <a href="http://code.google.com/appengine/articles/deferred.html">deferred.defer library</a>, which takes care of the public url and thunking the call from there into a method of your choice. So instead your call can look like this:<br />
<br />
<span class="Apple-style-span" style="background-color: white; font-family: Helvetica, Arial, sans-serif; font-size: x-small;"></span><br />
<pre class="prettyprint" style="background-color: #fafafa; border-bottom-color: rgb(187, 187, 187); border-bottom-style: solid; border-bottom-width: 1px; border-left-color: rgb(187, 187, 187); border-left-style: solid; border-left-width: 1px; border-right-color: rgb(187, 187, 187); border-right-style: solid; border-right-width: 1px; border-top-color: rgb(187, 187, 187); border-top-style: solid; border-top-width: 1px; color: #007000; font-family: monospace; font-size: 9pt; line-height: 15px; margin-top: 1em; overflow-x: auto; overflow-y: auto; padding-bottom: 0.99em; padding-left: 0.99em; padding-right: 0.99em; padding-top: 0.99em; word-wrap: break-word;"><span class="kwd" style="color: #000088;">from</span><span class="pln" style="color: black;"> google</span><span class="pun" style="color: #666600;">.</span><span class="pln" style="color: black;">appengine</span><span class="pun" style="color: #666600;">.</span><span class="pln" style="color: black;">ext </span><span class="kwd" style="color: #000088;">import</span><span class="pln" style="color: black;"> deferred
</span><span class="kwd" style="color: #000088;">def</span><span class="pln" style="color: black;"> do_something_expensive</span><span class="pun" style="color: #666600;">(</span><span class="pln" style="color: black;">a</span><span class="pun" style="color: #666600;">,</span><span class="pln" style="color: black;"> b</span><span class="pun" style="color: #666600;">,</span><span class="pln" style="color: black;"> c</span><span class="pun" style="color: #666600;">=</span><span class="kwd" style="color: #000088;">None</span><span class="pun" style="color: #666600;">):</span><span class="pln" style="color: black;">
    logging</span><span class="pun" style="color: #666600;">.</span><span class="pln" style="color: black;">info</span><span class="pun" style="color: #666600;">(</span><span class="str" style="color: #008800;">"Doing something expensive!"</span><span class="pun" style="color: #666600;">)</span><span class="pln" style="color: black;">
    </span><span class="com" style="color: #880000;"># Do your work here</span><span class="pln" style="color: black;">
</span><span class="com" style="color: #880000;"># Somewhere else</span><span class="pln" style="color: black;">
deferred</span><span class="pun" style="color: #666600;">.</span><span class="pln" style="color: black;">defer</span><span class="pun" style="color: #666600;">(</span><span class="pln" style="color: black;">do_something_expensive</span><span class="pun" style="color: #666600;">,</span><span class="pln" style="color: black;"> </span><span class="str" style="color: #008800;">"Hello, world!"</span><span class="pun" style="color: #666600;">,</span><span class="pln" style="color: black;"> </span><span class="lit" style="color: #006666;">42</span><span class="pun" style="color: #666600;">,</span><span class="pln" style="color: black;"> c</span><span class="pun" style="color: #666600;">=</span><span class="kwd" style="color: #000088;">True</span><span class="pun" style="color: #666600;">)</span></pre>
<br />
<br />
That's cool, isn't it!<br />
<br />
What's also cool about tasks is that you can delay them, either by specifying <span class="Apple-style-span" style="font-family: inherit;">a <span class="Apple-style-span" style="background-color: white; font-style: italic; line-height: 16px;">countdown </span><span class="Apple-style-span" style="background-color: white; line-height: 16px;">or an </span><span class="Apple-style-span" style="background-color: white; font-style: italic; line-height: 16px;">eta</span><span class="Apple-style-span" style="background-color: white; font-style: italic; line-height: 16px;">.</span><span class="Apple-style-span" style="background-color: white; line-height: 16px;"> Using a countdown (number of seconds before execution) is interesting, because you can delay tasks, ie: spread the work out a bit. But using an eta is really fascinating, because it lets you schedule work for specific times. So if you need to schedule an email to go out at midnight, a task with an eta will do that for you, with no real plumbing required on your part. (Can you do this with a pull queue? You may be able to use <i>eta</i> to stop tasks showing up through the lease system before a specified time, I'm not sure about this.)</span></span><br />
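<br />
As a rough illustration of the midnight case, here's a plain Python sketch that works out the eta. next_midnight_utc and send_midnight_email are made-up names; _eta and _countdown are the keyword arguments deferred.defer passes through to the task queue.<br />

```python
from datetime import datetime, timedelta

def next_midnight_utc(now):
    """First midnight (UTC) strictly after `now`, usable as a task eta."""
    tomorrow = now.date() + timedelta(days=1)
    return datetime(tomorrow.year, tomorrow.month, tomorrow.day)

# Arbitrary "now" for illustration.
eta = next_midnight_utc(datetime(2011, 10, 1, 17, 54))

# On App Engine the eta would then be handed to the task, for example:
#   deferred.defer(send_midnight_email, _eta=eta)        # send_midnight_email is hypothetical
#   deferred.defer(send_midnight_email, _countdown=30)   # or: run roughly 30 seconds from now
```

The eta is a naive datetime here, which I'm assuming gets treated as UTC.<br />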
<br />
This is all great for performing scheduled background tasks. Except, what if they fail? Or take a long time to complete? In fact, how can you report on the status of these tasks? Well, you can't. There's no way to go in and find out much about the task through any APIs. Even if there were, you'd probably need custom information suited to the job at hand anyway.<br />
<br />
What I need is an object in the datastore that maps to the task. I personally prefer an object oriented approach (ok, I'm an old man set in my ways, yes I know). So, what I'd like is a base object which lets me set up a task, kick it off, record its progress, and lets me see afterwards how it went.<br />
<br />
So I created the Worker. The worker is a base class polymodel object that you can use to do background jobs. You need to subclass it, and provide it with a job to do (doExecute()) and a method for calculating the next time to run if you want a repeating job (doCalculateNextRun()). You can also provide a specific queue name (override GetQueue()) and you can specify whether or not it should run immediately (override ExecuteImmediately()). If ExecuteImmediately() returns false, then on the first, immediate run it won't call doExecute(), but instead will call doCalculateNextRun() and reschedule itself.<br />
<br />
So for instance, if you want to run a background job immediately (say send an email), you make this class:<br />
<br />
<pre>import logging
from google.appengine.api import mail
from Worker import Worker

class SendAnEmailImmediately(Worker):
    def doExecute(self):
        lemailStr = "Betty@example.com"
        logging.info("Sending emails to %s" % lemailStr)
        lmessage = mail.EmailMessage(
            sender="Anne@example.com",
            to=lemailStr,
            subject="Hi Betty",
            body="I know you love email!"
        )
        lmessage.send()

    def doCalculateNextRun(self, aUtcNow, alastDue):
        return None # never reschedule
</pre>
<br />
To kick it off, do this:<br />
<br />
<span class="Apple-style-span" style="font-family: monospace;"><span class="Apple-style-span" style="white-space: pre;">
<pre>
lsender = SendAnEmailImmediately()
lsender.status = 0
lsender.enabled = True
lsender.put()
</pre>
</span></span><br />
<pre><span class="Apple-style-span" style="font-family: 'Times New Roman'; white-space: normal;">
</span></pre>
<pre><span class="Apple-style-span" style="font-family: 'Times New Roman'; white-space: normal;">And what do you get out of that? Well, not only does the email get sent from a background task, but afterward you'll have a SendAnEmailImmediately object in the datastore, with these properties:</span></pre>
<br />
<pre><span class="Apple-style-span" style="background-color: white;">lastRunSucceeded = db.BooleanProperty()
lastRunMessage = db.StringProperty()
lastRunStartTime = db.DateTimeProperty()
lastRunFinishTime = db.DateTimeProperty()
</span></pre>
<div>
which give you information on when it ran and how the worker actually went; did it fail? If so, what errors occurred? </div>
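<div>
To show how those fields get filled in, here's a minimal sketch in plain Python with no datastore involved. RunRecord and run_and_record are illustrative stand-ins, not part of the Worker class; they only mirror the idea of trapping the job's exception and recording the outcome.</div>

```python
from datetime import datetime

class RunRecord(object):
    """In-memory stand-in for the status fields a Worker keeps in the datastore."""
    def __init__(self):
        self.lastRunSucceeded = None
        self.lastRunMessage = None
        self.lastRunStartTime = None
        self.lastRunFinishTime = None

def run_and_record(job, record):
    """Run `job`, trapping any exception, and note the outcome on `record`."""
    record.lastRunStartTime = datetime.utcnow()
    try:
        job()
        record.lastRunSucceeded = True
        record.lastRunMessage = None
    except Exception as ex:
        record.lastRunSucceeded = False
        record.lastRunMessage = str(ex)
    record.lastRunFinishTime = datetime.utcnow()
    return record

def failing_job():
    # Hypothetical failure, just to exercise the error path.
    raise ValueError("mail quota exceeded")

ok = run_and_record(lambda: None, RunRecord())
bad = run_and_record(failing_job, RunRecord())
```

<div>
After a failed run, lastRunSucceeded is False and lastRunMessage holds the exception text, which is the kind of thing you'd want to query for later.</div>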
<div>
<br /></div>
<div>
How about a recurring task? Try this one, which sends an email once per hour:</div>
<div>
<br /></div>
<div>
<span class="Apple-style-span" style="font-family: monospace;"><span class="Apple-style-span" style="white-space: pre;">from datetime import timedelta

class SendAnEmailEveryHour(Worker):
    def doExecute(self):
        lemailStr = "Betty@example.com"
        logging.info("Sending emails to %s" % lemailStr)
        lmessage = mail.EmailMessage(
            sender="Anne@example.com",
            to=lemailStr,
            subject="Hi again Betty",
            body="Are you feeling loved yet?"
        )
        lmessage.send()

    def doCalculateNextRun(self, aUtcNow, alastDue):
        # base the next run on the last due time, if there was one
        if alastDue:
            lbaseDate = alastDue
        else:
            lbaseDate = aUtcNow
        return lbaseDate + timedelta(minutes=60)
</span></span><br />
<span class="Apple-style-span" style="font-family: monospace;"><span class="Apple-style-span" style="white-space: pre;"><br /></span></span></div>
<div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">and again, kick it off like this:</span></span></div>
<div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span"><br /></span></span></div>
<div>
<pre>
lsender = SendAnEmailEveryHour()
lsender.status = 0
lsender.enabled = True
lsender.put()
</pre>
</div>
<div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">Ok, that'll work. However, what if we want a record of each run? Then do it like this instead:</span></span></div>
<div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span"><br /></span></span></div>
<div>
<span class="Apple-style-span" style="font-family: monospace;"><span class="Apple-style-span" style="white-space: pre;">
<pre>
class SendAnEmailEveryHour2(Worker):
    def doExecute(self):
        # kick off a separate one-shot worker, so each send gets its own record
        lsender = SendAnEmailImmediately()
        lsender.status = 0
        lsender.enabled = True
        lsender.put()

    def doCalculateNextRun(self, aUtcNow, alastDue):
        if alastDue:
            lbaseDate = alastDue
        else:
            lbaseDate = aUtcNow
        return lbaseDate + timedelta(minutes=60)
</pre>
</span></span><br />
<div>
<span class="Apple-style-span" style="font-family: monospace;"><span class="Apple-style-span" style="white-space: pre;"><br /></span></span></div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">
</span></span>
<br />
<div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span"><span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">So now you get a recurring worker kicking off other workers, one per job.</span></span></span></span></div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">
</span></span>
<br />
<div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span"><span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span"><br /></span></span></span></span></div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">
</span></span>
<br />
<div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span"><span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">You can see how powerful this is as a simple method of structuring background jobs!</span></span></span></span></div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">
</span></span>
<br />
<div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span"><span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span"><br /></span></span></span></span></div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">
</span></span>
<br />
<div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span"><span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">Ok, hold onto your hats, excuse my n00bish python, and get ready for a slab of code. Here's the implementation of Worker:</span></span></span></span></div>
<span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span">
</span></span></div>
<br />
<br />
<span class="Apple-style-span" style="background-color: white;">##################################################################</span><br />
<pre><span class="Apple-style-span" style="background-color: white;">from google.appengine.ext import db
from google.appengine.ext.db import polymodel
import logging
from datetime import datetime
from datetime import timedelta
from google.appengine.ext import deferred
from lib.pytz.gae import pytz
import uuid

class Worker(polymodel.PolyModel):
    nextDue = db.DateTimeProperty()
    enabled = db.BooleanProperty()
    status = db.IntegerProperty() # 0 = ready, 1 = running, 2 = stopped
    lastRunSucceeded = db.BooleanProperty()
    lastRunMessage = db.StringProperty() # only if
    lastRunStartTime = db.DateTimeProperty()
    lastRunFinishTime = db.DateTimeProperty()
    createTime = db.DateTimeProperty(auto_now_add = True)
    taskid = db.StringProperty()

    # override to change queues
    def GetQueue(self):
        return "default"

    # override to do first run in the future
    def ExecuteImmediately(self):
        return True

    # must override to perform work
    def doExecute(self):
        raise NotImplementedError

    # override to tell us when next to run
    def doCalculateNextRun(self, aUtcNow, alastDue):
        raise NotImplementedError

    def Execute(self, aTaskID, aIsFirstRun, **kwargs):
        try:
            #Don't trust depickled self, go reload self
            #Nick Johnson told me not to do this - needs to be fixed
            self = db.get(self.key())
        except db.NotSavedError, ex:
            self = None

        lutcNow = datetime.utcnow()

        if not self:
            logging.warning("eek I am gone! (disappears in a puff of logic)")
        elif not aTaskID:
            logging.warning("No aTaskID, skipping")
        elif aTaskID != self.taskid:
            logging.debug("TaskIDs do not match, skipping")
        elif not self.enabled:
            logging.warning("Disabled, skipping")
        elif self.status != 0:
            logging.warning("Wrong status to execute Worker, status = %s, skipping" % (self.status))
        elif self.nextDue and self.nextDue > lutcNow:
            logging.debug("Don't run till %s, reschedule..." % (self.nextDue))
            if (self.nextDue - lutcNow) > timedelta(1):
                # don't reschedule more than a day forward
                lresched = lutcNow + timedelta(1) # add a day
            else:
                lresched = self.nextDue
            lqueue = self.GetQueue()
            deferred.defer(
                self.Execute,
                _queue=lqueue,
                _eta=lresched,
                aTaskID=self.taskid,
                aIsFirstRun=aIsFirstRun,
                )
        else:
            if aIsFirstRun and not self.nextDue and not self.ExecuteImmediately():
                logging.debug("First run, don't execute")
            else:
                logging.debug("We can execute")
                try:
                    self.status = 1 # running
                    self.lastRunStartTime = datetime.utcnow()
                    self.put()
                    logging.debug("Before doExecute()")
                    self.doExecute()
                    logging.debug("After doExecute()")
                    self.status = 0 # ready to run
                    self.lastRunSucceeded = True
                    self.lastRunMessage = None
                except Exception, ex:
                    self.status = 0
                    self.lastRunSucceeded = False
                    self.lastRunMessage = unicode(ex)
                    logging.error(ex)
                self.lastRunFinishTime = datetime.utcnow()

            logging.debug("calculate lnextRun")
            lnextRun = None
            try:
                lutcnow = datetime.utcnow()
                lnextRun = self.doCalculateNextRun(datetime.utcnow(), self.nextDue)
            except Exception, ex:
                logging.error(ex)

            if lnextRun:
                logging.debug("got lnextRun, need to reschedule")
                self.nextDue = lnextRun
                self.status = 0
                self.put()
                lqueue = self.GetQueue()
                if (lnextRun - lutcnow) > timedelta(1):
                    # don't reschedule more than a day forward
                    lresched = lutcnow + timedelta(1)
                else:
                    lresched = lnextRun
                if lresched <= lutcnow:
                    # run immediately, no eta provided
                    deferred.defer(
                        self.Execute,
                        _queue=lqueue,
                        aTaskID=self.taskid,
                        aIsFirstRun=False
                        )
                else:
                    # schedule future run
                    deferred.defer(
                        self.Execute,
                        _queue=lqueue,
                        _eta=lresched,
                        aTaskID=self.taskid,
                        aIsFirstRun=False
                        )
            else:
                logging.debug("no lnextRun, we are finished.")
                self.status = 2
                self.put()
</span></pre>
<pre><span class="Apple-style-span" style="background-color: white;"> # Need to override put to kick off the task if enabled is </span></pre>
<pre><span class="Apple-style-span" style="background-color: white;"> # set to True
def put(self, **kwargs):
lneedPut = True
# first grab a copy of what's currently stored.
logging.debug("Entered put, new self = %s" % (self))
loldself = None
if self.enabled:
logging.debug("Need to find out if enabled has been newly set. Load old self from datastore")
try:
loldself = self.get(self.key())
except Exception, ex:
logging.error(ex)
loldself = None
logging.debug("See if newly enabled has changed")
if self.enabled and (not loldself or not loldself.enabled):
logging.debug("Newly enabled. Need to schedule self to run")
self.taskid = unicode(uuid.uuid4())
logging.debug("taskid == %s" % (self.taskid))
logging.debug("Now schedule to run immediately")
#self.nextDue = None
self.status = 0
logging.debug("Pre-save")
super(Worker, self).put(**kwargs)
lneedPut = False
lqueue = self.GetQueue()
# run immediately
logging.debug("call deferred.defer")
deferred.defer(</span></pre>
<pre><span class="Apple-style-span" style="background-color: white;"> self.Execute, </span></pre>
<pre><span class="Apple-style-span" style="background-color: white;"> _queue_name=lqueue, </span></pre>
<pre><span class="Apple-style-span" style="background-color: white;"> aTaskID=self.taskid, </span></pre>
<pre><span class="Apple-style-span" style="background-color: white;"> aIsFirstRun=True</span></pre>
<pre><span class="Apple-style-span" style="background-color: white;"><span class="Apple-style-span"> )
else:
logging.debug("not newly enabled")
if lneedPut:
logging.debug("Do the actual put")
super(Worker, self).put(**kwargs)
</span></span><span class="Apple-style-span" style="font-family: 'Times New Roman'; white-space: normal;"><span class="Apple-style-span" style="background-color: white;">##################################################################</span></span><span class="Apple-style-span" style="background-color: white; font-family: inherit;">
</span>
</pre>
A couple of footnotes before I leave you with this:<br />
<br />
1: I've used deferred.defer on an instance method, which has an issue. Specifically, it has to pickle the whole instance, then unpickle it when the task is run. That's a little expensive, and it leaves the running task with a stale copy of the object. So, I have to do this:<br />
<pre><span class="Apple-style-span" style="background-color: white;">
</span></pre>
<pre><span class="Apple-style-span" style="background-color: white;">self = db.get(self.key())</span></pre>
<br />
to replace the passed in version of the object with the object from the datastore.<br />
<br />
It would be better if Execute were a class method that took self.key() as a parameter, then loaded the full instance using the key on entry. It's a simple change, but I want to test it before I change it here. I'm sure people will point out all kinds of issues, so I'll wait to change it until then.<br />
<br />
2: You'd think I'd have some kind of "Go()" method to kick things off, instead of using an override on put() to detect a change in the "enabled" property. However, I've been specifically using this in the context of a REST API, where I don't want to be calling methods. So, this method of overriding put() has been just the ticket. To complete the implementation I should also do some monkey patching of db.put(), but I haven't needed that yet and it's a minor PITA to do, so it's left to the reader for now. Actually, this approach of overriding put() to do work in a REST context is a paradigm I'll explore in detail in a subsequent post. It goes really well with the REST library <a href="http://code.google.com/p/appengine-rest-server/">appengine-rest-server</a>.<br />
<br />
The Amazing Story of AppEngine and the Two Orders Of Magnitude (posted 2011-10-01)<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQtpjpODXrZ3OTAzQJ-9Kg38bEvWCg5-0LJbeAhwAxE2u-6kMXCcGHEzz2t-MU9lURAhwORUXCbVo88ql61IX34E4oI-5KVyvFwQghyphenhyphenZoB_e4k_sKXpDx46OVXcMl2Db4Tyb4G3vtfulE/s1600/iStock_000002378241XSmall.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQtpjpODXrZ3OTAzQJ-9Kg38bEvWCg5-0LJbeAhwAxE2u-6kMXCcGHEzz2t-MU9lURAhwORUXCbVo88ql61IX34E4oI-5KVyvFwQghyphenhyphenZoB_e4k_sKXpDx46OVXcMl2Db4Tyb4G3vtfulE/s200/iStock_000002378241XSmall.jpg" width="133" /></a></div>
<br />
My first shot at blogging about AppEngine was the four-part series about the new pricing model and how it'll affect me with my app Syyncc.<br />
<br />
<ul>
<li><span class="Apple-style-span" style="background-color: white; line-height: 24px;"><a href="http://point7.wordpress.com/2011/09/03/the-amazing-story-of-appengine-and-the-two-orders-of-magnitude/" rel="bookmark" style="text-decoration: none;" title="Permanent Link: The Amazing Story Of AppEngine And The Two Orders Of Magnitude"><span class="Apple-style-span" style="font-family: inherit;">The Amazing Story Of AppEngine And The Two Orders Of Magnitude</span></a></span></li>
<li><span class="Apple-style-span" style="background-color: white; line-height: 28px;"><a href="http://point7.wordpress.com/2011/09/04/appengine-tuning-1/" rel="bookmark" style="text-decoration: none;" title="Permanent Link: AppEngine Tuning #1"><span class="Apple-style-span" style="font-family: inherit;">AppEngine Tuning #1</span></a></span></li>
<li><span class="Apple-style-span" style="background-color: white; line-height: 28px;"><a href="http://point7.wordpress.com/2011/09/07/appengine-tuning-an-instance-of-success/" rel="bookmark" style="text-decoration: none;" title="Permanent Link: AppEngine Tuning – An instance of success?"><span class="Apple-style-span" style="font-family: inherit;">AppEngine Tuning – An instance of success?</span></a></span></li>
<li><span class="Apple-style-span" style="background-color: white; line-height: 28px;"><a href="http://point7.wordpress.com/2011/09/10/appengine-tuning-schlemiel-youre-fired/" rel="bookmark" style="text-decoration: none;" title="Permanent Link: AppEngine Tuning – Schlemiel, you’re fired!"><span class="Apple-style-span" style="font-family: inherit;">AppEngine Tuning – Schlemiel, you’re fired!</span></a></span></li>
</ul>
<br />
<span class="Apple-style-span" style="font-family: inherit;">I need to do some more follow-up, because I think I was wrong about how instance pricing works (there's some discussion of that in the comments on the last post, I think). If so, the picture is even rosier than I painted it!</span><br />
<br />
Frist (posted 2011-10-01)<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQlUZaYR3oLGK02Ywh6pB9NPeCF-x3S_Y1XS2aG_Xg1CMOiTcEGcEsCkH0Jek9zzCELd6gBToNTVDCyFRAo_78bQBs9VlvlVnRy29ORNdWcN42NZQfpephGSWPYeLGYISyVww5MVHaMJo/s1600/skinnedknee.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQlUZaYR3oLGK02Ywh6pB9NPeCF-x3S_Y1XS2aG_Xg1CMOiTcEGcEsCkH0Jek9zzCELd6gBToNTVDCyFRAo_78bQBs9VlvlVnRy29ORNdWcN42NZQfpephGSWPYeLGYISyVww5MVHaMJo/s1600/skinnedknee.gif" /></a></div>
<br />
<br />
First post for my professional <a href="http://appengine.google.com/">AppEngine</a> blog. Hi all!<br />
<br />
I've been inspired to write this blog based on my work with AppEngine. Until recently, that work was fairly light on, comprising mostly my application <a href="http://my.syyn.cc/">Syyncc</a>. However, I recently took on a new role with <a href="http://ecampus.com.au/">Ecampus</a>, in which we're pushing forward with new product development in AppEngine. And in beginning that work I've learned some things.<br />
<br />
Firstly, I've learned that I barely touched on the capabilities of AppEngine and web development with Syyncc. That app has some interesting back end work going on, but in terms of web apps it's fairly primitive. So, as I now contemplate some serious development in AppEngine (and by "contemplate" I mean "struggle to do"), the gaps in my knowledge are making themselves known.<br />
<br />
Secondly, it's becoming clear that AppEngine is deep and wide. This isn't just some toy system, it's got chops. And while it might look a bit like LAMP stack style virtual web hosting, it is much more than that. But, it is also different. It's a serious distributed high level PaaS environment, and to make decent use of it, I'm going to need to understand it on its own terms. So for instance, the <a href="http://code.google.com/appengine/docs/python/datastore/hr/overview.html">HRD</a> is not like a SQL database, it's like HRD.<br />
<br />
Thirdly, I'm going to need some frameworks. AppEngine provides amazing features, but they tend to be platform focused, rather than application focused. For professional calibre work, you need libraries and frameworks to bridge the gap. There do seem to be things around the place, but they're a bit dispersed, many are open source projects in states of gentle decline, and generally there's not much of a big picture. So I'm going to need to do some hard yards pulling a pro framework together myself.<br />
<br />
In light of these things, I've decided that I need to put a serious effort into skilling up and creating a codebase of app frameworks. I'll be diving deep into various AppEngine related topics, cutting code, testing hypotheses.<br />
<br />
And you know, why not flaunt my ignorance in public? Thus, this blog. I invite you, dear reader, to watch as I stumble my way blindly through this new landscape, skin my knee here and there, and possibly create something useful along the way. And, if you like the look of those skinned knees and elbows and want some of your own, then jump in, speak up, and we can plow on painfully together!