Advanced Query Manipulation with MySQL ProxyAdvanced Query Manipulation with MySQL Proxy Kay Roepke (Sun Microsystems) From the official conference description at http://www.mysqlconf.com/mysql2009/public/schedule/detail/7040 Currently MySQL Proxy only comes with an incomplete tokenizer for a subset of the MySQL dialect. Many use cases require more knowledge about the query that a stream of tokens can provide and users are force to create their own parsers, most of which are handwritten and simplistic, in Lua. While this is often sufficient for special cases and specific applications, it cannot serve as an extensible and robust framework. For the purpose of query formatting in the MySQL Enterprise Monitor I have written a new parser using ANTLR, where I am a committer in the project. Due to the nature of ANTLR generated recognizers, it is possible to target different implementation languages with little effort and thus has been integrated with MySQL Proxy to allow accurate query recognition and manipulation. Even though a parser that accurately recognizes MySQLs dialect is a crucial step in supporting sharding, for example, it is only the first step. The supporting infrastructure in MySQL Proxy is at least as important. The session will very briefly introduce ANTLR and its development tools to provide some valuable background. I will show how to use the built-in symbol table, the produced abstract syntax tree and the built-in visitors to gather information about the query which assist the developer in safely manipulating the query. As an example I will demonstrate how to implement a sharding solution that is able to correctly retrieve data when given a query that uses subqueries, where one of the queries refers to data that actually resides on a different server. For example: SELECT id, name, email FROM users WHERE id IN (SELECT user_id FROM forum_members WHERE forum_id = ? and user_id > ?) The table users in this example is sharded whereas forum_members is not. The goal is to transparently retrieve all necessary data in a way that the application does not need to know the specifics about what is sharded and where the actual data resides. This is also a valuable tool for debugging purposes. Another common problem in large MySQL deployments arises when automatically generated SQL contains an error. Often the standard MySQL error messages are not helpful, particularly when there are incorrect values in IN clauses or parts of the WHERE clause are missing: The default error message often does not give enough information to discern which query was incorrect. In these cases it is desirable to override the reported error message with something that explains the actual error and gives more information about the error, such as more context, more locality, or in specific cases like with IN, less locality. The common error where a NULL value is stringified to an empty string, thus making the query fail to execute, can be handled in several ways: * Either by rewriting the query to not containing two consecutive , characters in expressions and give a warning, * or to give an error that contains better information than: SELECT * FROM test WHERE a IN (1,,10); [...] for the right syntax to use near 10) at line 1 I will demonstrate both use cases and show strategies to implement custom error reporting. Legacy applications and/or databases often refer to or contain old schema names, thus requiring either views or modification to work correctly. Sometimes it is not feasible or impossible to convert applications and the transformations on the query might need supporting logic in such a way that a view cannot be used. For such cases I will discuss solutions and demonstrate sample implementations using on-the-fly query rewriting. code pour embarquer la vidéo : >>> http://www.youtube.com/embed/LuwScTm5zrY <<< |