Blogs
Submitted by Neil Rubens on Tue, 01/12/2010 - 09:27
Great lecture; a must-watch: http://videolectures.net/kdd09_hand_mmwrdd/
Mismatched Models, Wrong Results, and Dreadful Decisions
author: David J. Hand, Department of Mathematics, Imperial College London
Description
Data mining techniques use score functions to quantify how well a model fits a given data set. Parameters are estimated by optimising the fit, as measured by the chosen score function, and model choice is guided by the size of the scores for the different models. Since different score functions summarise the fit in different ways, it is important to choose a function which matches the objectives of the data mining exercise.
For predictive classification problems, a wide variety of score functions exist, including measures such as precision and recall, the F measure, misclassification rate, the area under the ROC curve (the AUC), and others. The first four of these require a classification threshold to be chosen, a choice which may not be easy, or may even be impossible, especially when the classification rule is to be applied in the future. In contrast, the AUC does not require the specification of a classification threshold, but summarises performance over the range of possible threshold choices.
However, unfortunately, and despite the widespread use of the AUC, it has a previously unrecognised fundamental incoherence lying at the core of its definition. This means that using the AUC can lead to poor model choice and unnecessary misclassifications. The AUC is set in context, its deficiency explained and the implications illustrated, with the bottom line being that the AUC should not be used. A family of coherent alternative scores is described. The ideas are illustrated with examples from bank loans, fraud, face recognition, and health screening.
Submitted by Neil Rubens on Tue, 09/08/2009 - 16:09
iPhone / iPod: delete all music, videos, applications, etc.
To do this, go to iTunes, open the tab (e.g. "music") from which you want to delete everything, and un-check "Sync ..". This should get rid of it.
Submitted by Neil Rubens on Wed, 09/02/2009 - 23:29
I always run into this trouble when I compile LaTeX files. Here is the solution / reminder. P.S. Don't forget to specify \bibliographystyle.
http://forums.macnn.com/82/applications/88947/help-with-bibtex-and-texshop/
1. Put the citations in your .tex file in the form
\cite{<key>}. Papers you have not cited will not appear
in the bibliography.
2. Put \bibliography{<bibfilename>} in your .tex file
where you want the bibliography to appear. Make sure the .bib file is
somewhere that latex can find it, such as the same folder as the .tex
file.
3. Run latex then bibtex then latex then latex again, all on your .tex
file (actually you run bibtex on the .aux file, but texshop does this
for you).
Check the results.
For more details, see appendix B of the latex book, or chapter 13 of the latex companion.
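For anyone running the steps outside TeXShop, here is a minimal Python sketch of the run order from step 3. The file name paper.tex and the function name are made up for illustration, and the sketch only builds the command list; it does not run anything.

```python
from pathlib import Path


def bibtex_build_commands(tex_file):
    """Return the latex, bibtex, latex, latex sequence for one .tex file."""
    stem = Path(tex_file).stem  # "paper.tex" -> "paper"
    return [
        ["latex", tex_file],  # first pass records the citation keys in the .aux file
        ["bibtex", stem],     # bibtex works on the .aux file, addressed by its stem
        ["latex", tex_file],  # second pass pulls in the generated bibliography
        ["latex", tex_file],  # third pass resolves the citation labels
    ]


if __name__ == "__main__":
    # "paper.tex" is a placeholder name, not a file from the post.
    for command in bibtex_build_commands("paper.tex"):
        print(" ".join(command))
```

Each inner list could be passed to subprocess.run if latex and bibtex are on your PATH, which is an assumption about your setup.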
Submitted by Neil Rubens on Tue, 08/25/2009 - 15:35
Books:
- The Grammar of Graphics, Leland Wilkinson
- Visualizing Data, William S. Cleveland
- The Visual Display of Quantitative Information, Edward Tufte
- Information Visualization: Perception for Design, Colin Ware
- Show Me the Numbers: Designing Tables and Graphs to Enlighten, Stephen Few
Tools
- Tableau (pros: one of the best data exploration tools, free for open data; cons: somewhat costly ~$1,700)
- Pentaho Reporting
- Pivot (Microsoft) http://www.getpivot.com
- Many-Eyes http://many-eyes.com
- Verifiable http://verifiable.com
- TimeSearcher http://www.cs.umd.edu/hcil/timesearcher
- Parvis http://home.subnet.at/flo/mv/parvis
- Improvise http://www.cs.ou.edu/~weaver/improvise/
- GGobi http://ggobi.org (interactive, brushing, etc.)
Tree Network Tools
- GraphViz http://www.graphviz.org
- NodeXL http://www.codeplex.com/NodeXL
- GUESS http://graphexploration.cond.org/
- Pajek http://pajek.imfm.si/doku.php
- TreeMap http://www.cs.umd.edu/hcil/treemap
- Workbench http://nwb.slis.indiana.edu/
Programming Tools
- processing.org: a popular graphics language
- protovis.org: visualization tools for JavaScript
- flare.prefuse.org: visualization tools for Flash
- prefuse.org: visualization tools for Java
- modestmaps.com: mapping tools for Flash/JavaScript
People / Blogs
- Andrew (blogger)
- Nathan (blogger)
- Jeffrey Heer (visualization libraries)
- Katy Borner (visualization of science)
Reference
Some of the recommendations are by Jeffrey Heer (an expert in the area), given at the MediaX 2009 workshop.
Various
smoothScatter produces a smoothed color density representation of the scatterplot, obtained through a kernel density estimate.
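smoothScatter itself is an R function. To show what "a kernel density estimate" of a scatterplot means, here is a rough pure-Python sketch that evaluates a Gaussian kernel density on a grid. The function name, the bandwidth default, and the grid size are my own choices for illustration and do not mirror how smoothScatter is implemented.

```python
import math


def density_grid(points, bandwidth=0.5, size=20):
    """Gaussian kernel density estimate of 2-D points on a size x size grid."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Fall back to a unit step when all points share a coordinate.
    x_step = (x_max - x_min) / (size - 1) or 1.0
    y_step = (y_max - y_min) / (size - 1) or 1.0
    # Normalising constant of a 2-D Gaussian kernel averaged over the points.
    norm = 1.0 / (2 * math.pi * bandwidth ** 2 * len(points))
    grid = []
    for row in range(size):
        gy = y_min + row * y_step
        cells = []
        for col in range(size):
            gx = x_min + col * x_step
            total = sum(
                math.exp(-((gx - x) ** 2 + (gy - y) ** 2) / (2 * bandwidth ** 2))
                for x, y in points
            )
            cells.append(norm * total)
        grid.append(cells)
    return grid
```

Mapping each grid value to a colour gives the smoothed density picture; dense regions of the scatterplot get high values and empty regions get values near zero.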
Submitted by Neil Rubens on Wed, 08/05/2009 - 20:16
Submitted by Neil Rubens on Wed, 07/15/2009 - 19:22
Problem
You cannot install numpy on this volume. numpy requires System Python to install os x
Solution
NumPy has several installer files depending on your version of Python, e.g. 2.5 or 2.6. Make sure you download the right one.
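A quick way to see which Python version you actually have is to ask the interpreter. The sketch below is only an illustration: the helper names are mine, and the check assumes the installer file name contains a tag such as py2.5, which may not hold for every download.

```python
import sys


def python_minor_version(version_info=None):
    """Return the interpreter version as 'major.minor', e.g. '2.6'."""
    info = sys.version_info if version_info is None else version_info
    return "%d.%d" % (info[0], info[1])


def installer_matches(installer_name, version_info=None):
    """Check whether an installer file name mentions this major.minor version.

    Assumes the name carries a tag like 'py2.5'; that naming is an assumption.
    """
    return ("py" + python_minor_version(version_info)) in installer_name


if __name__ == "__main__":
    print("Running Python " + python_minor_version())
```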
Submitted by Neil Rubens on Fri, 07/10/2009 - 07:17
For more information see the following guide kindly provided by Google. Here is the partial copy:
Standard Directory and Package Layout
GWT projects are overlaid onto Java packages such that most of the configuration can be inferred from the classpath and the module definitions.
Guidelines
If you are not using the Command-line tools to generate your project files and directories, here are some guidelines to keep in mind when organizing your code and creating Java packages.
- Under the main project directory create the following directories:
- src folder - contains production Java source
- war folder - your web app; contains static resources as well as compiled output
- test folder - (optional) JUnit test code would go here
- Within the src package, create a project root package and a client package.
- If you have server-side code, also create a server package to differentiate between the client-side code (which is translated into JavaScript) from the server-side code (which is not).
- Within the project root package, place one or more module definitions.
- In the war directory, place any static resources (such as the host page, style sheets, or images).
- Within the client and server packages, you are free to organize your code into any subpackages you require.
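If you set the folders up by hand, the guidelines above can be sketched as a small Python helper. This is my own illustration, not part of the Google guide: the function name is hypothetical, and it only creates empty folders (no module XML, no host page).

```python
from pathlib import Path


def create_gwt_skeleton(project_dir, root_package, with_server=True):
    """Create src, war, test and the root/client/server package folders."""
    project = Path(project_dir)
    # A Java package such as "a.b.c" maps to the folder path a/b/c.
    package_path = Path(*root_package.split("."))
    folders = [
        project / "src" / package_path / "client",  # client-side code
        project / "war",                            # static resources and compiled output
        project / "test",                           # optional JUnit tests
    ]
    if with_server:
        folders.append(project / "src" / package_path / "server")
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

Calling it with a placeholder package such as com.example.app creates src/com/example/app/client and src/com/example/app/server next to war and test.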
Example: GWT standard package layout
For example, all the files for the "DynaTable" sample are organized in a main project directory also called "DynaTable".
- Java source files are in the directory: DynaTable/src/com/google/gwt/sample/dynatable
- The module is defined in the XML file: DynaTable/src/com/google/gwt/sample/dynatable/DynaTable.gwt.xml
- The project root package is: com.google.gwt.sample.dynatable
- The logical module name is: com.google.gwt.sample.dynatable.DynaTable
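The relation between the module XML path and the logical module name in this example can be written down as a short Python sketch. The function is hypothetical and assumes the path contains a src folder and a file ending in .gwt.xml, as in the DynaTable sample.

```python
from pathlib import PurePosixPath


def logical_module_name(module_xml_path):
    """Derive the logical module name from a module XML path under src."""
    parts = PurePosixPath(module_xml_path).parts
    # Everything after the src folder is the package path plus the file name.
    after_src = parts[parts.index("src") + 1:]
    file_name = after_src[-1]
    suffix = ".gwt.xml"
    if not file_name.endswith(suffix):
        raise ValueError("expected a .gwt.xml module file")
    return ".".join(after_src[:-1] + (file_name[: -len(suffix)],))


if __name__ == "__main__":
    print(logical_module_name(
        "DynaTable/src/com/google/gwt/sample/dynatable/DynaTable.gwt.xml"))
```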
The src directory
The src directory contains an application's Java source files, the module definition, and external resource files.
Package | File | Purpose |
com.google.gwt.sample.dynatable | | The project root package contains module XML files. |
com.google.gwt.sample.dynatable | DynaTable.gwt.xml | Your application module. Inherits com.google.gwt.user.User and adds an entry point class, com.google.gwt.sample.dynatable.client.DynaTable. |
com.google.gwt.sample.dynatable | | Static resources that are loaded programmatically by GWT code. Files in the public directory are copied into the same directory as the GWT compiler output. |
com.google.gwt.sample.dynatable | logo.gif | An image file available to the application code. You might load this file programmatically using this URL: GWT.getModuleBaseURL() + "logo.gif". |
com.google.gwt.sample.dynatable.client | | Client-side source files and subpackages. |
com.google.gwt.sample.dynatable.client | DynaTable.java | Client-side Java source for the entry-point class. |
com.google.gwt.sample.dynatable.client | SchoolCalendarService.java | An RPC service interface. |
com.google.gwt.sample.dynatable.server | | Server-side code and subpackages. |
com.google.gwt.sample.dynatable.server | SchoolCalendarServiceImpl.java | Server-side Java source that implements the logic of the service. |
The war directory
The war directory is the deployment image of your web application. It is in the standard expanded war format recognized by a variety of Java web servers, including Tomcat, Jetty, and other J2EE servlet containers. It contains a variety of resources:
- Static content you provide, such as the host HTML page
- GWT compiled output
- Java class files and jar files for server-side code
- A web.xml file that configures your web app and any servlets
A detailed description of the war format is beyond the scope of this document, but here are the basic pieces you will want to know about:
Directory | File | Purpose |
DynaTable/war/ | DynaTable.html | A host HTML page that loads the DynaTable app. |
DynaTable/war/ | DynaTable.css | A static style sheet that styles the DynaTable app. |
DynaTable/www/dynatable/ | | The DynaTable module directory where the GWT compiler writes output and files on the public path are copied. NOTE: by default this directory would be the long, fully-qualified module name com.google.gwt.sample.dynatable.DynaTable. However, in our GWT module XML file we used the rename-to="dynatable" attribute to shorten it to a nice name. |
DynaTable/www/dynatable/ | dynatable.nocache.js | The "selection script" for DynaTable. This is the script that must be loaded from the host HTML to load the GWT module into the page. |
DynaTable/war/WEB-INF | | All non-public resources live here; see the servlet specification for more detail. |
DynaTable/war/WEB-INF | web.xml | Configures your web app and any servlets. |
DynaTable/war/WEB-INF/classes | | Java compiled class files live here to implement server-side functionality. If you're using an IDE, set the output directory to this folder. |
DynaTable/war/WEB-INF/lib | | Any library dependencies your server code needs go here. |
DynaTable/war/WEB-INF/lib | gwt-servlet.jar | If you have any servlets using GWT RPC, you will need to place a copy of gwt-servlet.jar here. |
The test directory
The test directory contains the source files for any JUnit tests.
Package | File | Purpose |
com.google.gwt.sample.dynatable.client | | Client-side test files and subpackages. |
com.google.gwt.sample.dynatable.client | DynaTableTest.java | Test cases for the entry-point class. |
com.google.gwt.sample.dynatable.server | | Server-side test files and subpackages. |
com.google.gwt.sample.dynatable.server | SchoolCalendarServiceImplTest.java | Test cases for server classes. |
Submitted by Neil Rubens on Thu, 07/09/2009 - 22:14
Problem
Cross-site RPC seemed to work with JSON but not with XML.
Solution
Strip out whitespace characters (including new lines).
Keywords
XML GWT AJAX XML JSON Google App Engine python cross site web service
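On the Python side, the stripping could look roughly like the sketch below. It assumes the whitespace that matters is the indentation and line breaks between tags; the function name is made up, and text inside elements keeps its spaces.

```python
import re


def compact_xml(xml_text):
    """Drop whitespace between tags and any remaining line breaks."""
    # Remove indentation and newlines that sit between a closing ">" and the next "<".
    without_gaps = re.sub(r">\s+<", "><", xml_text.strip())
    # Remove line breaks left elsewhere, e.g. inside text content.
    return re.sub(r"[\r\n]+", "", without_gaps)


if __name__ == "__main__":
    print(compact_xml("<a>\n  <b>hi there</b>\n</a>\n"))
```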
Submitted by Neil Rubens on Tue, 07/07/2009 - 19:29
Task
Run both Java and Python within the same application. (Note: this really only makes sense if you want to use common services such as datastore, memcache, queue, etc.; if not, just deploy them as separate applications (which doubles your quota) and communicate between them by using web services.)
Solution
You can simply deploy them to different versions. Note that versions don't have to be numeric. You can deploy your Java code to version "java", and the corresponding URL will be http://java.latest.YourApp.appspot.com ; and deploy your Python code to http://py.latest.YourApp.appspot.com by using version "py".
You can let the Java and Python versions communicate with each other by using JSON (more precisely JSONP, for cross-site requests): http://code.google.com/webtoolkit/tutorials/1.6/Xsite.html
Using GWT also makes this job somewhat easier.
Keywords
gae app same both java python simultaneously java and python both java and python app id appid together google app engine gwt
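As a small illustration of the two pieces above, here is a Python sketch that builds the per-version URL and wraps a JSON payload JSONP-style. Both function names are hypothetical, and the callback name in the example is a placeholder; a real handler would read it from the request.

```python
import json


def version_url(version, app_id):
    """Build the URL of one deployed version, following the pattern above."""
    return "http://%s.latest.%s.appspot.com" % (version, app_id)


def jsonp_body(callback, payload):
    """Wrap a JSON payload in a callback call, which is what a JSONP response is."""
    return "%s(%s);" % (callback, json.dumps(payload, sort_keys=True))


if __name__ == "__main__":
    print(version_url("py", "YourApp"))
    print(jsonp_body("handleResponse", {"status": "ok"}))
```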
Submitted by Neil Rubens on Fri, 07/03/2009 - 19:59
Problem
WARNING: Failed startup of context com.google.apphosting.utils.jetty.DevAppEngineWebAppContext java.util.zip.ZipException: error in opening zip file
HTTP ERROR: 503
SERVICE_UNAVAILABLE
RequestURI=/T1.html
Powered by jetty://
http://localhost:8080/T1.html
Jul 3, 2009 10:58:46 AM com.google.apphosting.utils.jetty.JettyLogger warn
WARNING: Failed startup of context com.google.apphosting.utils.jetty.DevAppEngineWebAppContext@32df24{/,/Volumes/TRASCEND/docs/neil/Research/GroupFormation/code/GAE/t1/war}
java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:203)
at java.util.jar.JarFile.<init>(JarFile.java:132)
at java.util.jar.JarFile.<init>(JarFile.java:97)
at org.mortbay.jetty.webapp.TagLibConfiguration.configureWebApp(TagLibConfiguration.java:171)
at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1215)
at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
at org.mortbay.jetty.Server.doStart(Server.java:217)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
at com.google.appengine.tools.development.JettyContainerService.startContainer(JettyContainerService.java:147)
at com.google.appengine.tools.development.AbstractContainerService.startup(AbstractContainerService.java:116)
at com.google.appengine.tools.development.DevAppServerImpl.start(DevAppServerImpl.java:211)
at com.google.appengine.tools.development.gwt.AppEngineLauncher.start(AppEngineLauncher.java:86)
at com.google.gwt.dev.HostedMode.doStartUpServer(HostedMode.java:365)
at com.google.gwt.dev.HostedModeBase.startUp(HostedModeBase.java:590)
at com.google.gwt.dev.HostedModeBase.run(HostedModeBase.java:397)
at com.google.gwt.dev.HostedMode.main(HostedMode.java:232)
The server is running at http://localhost:8080/
2009-07-03 19:58:46.975 java[3373:80f] [Java CocoaComponent compatibility mode]: Enabled
2009-07-03 19:58:46.976 java[3373:80f] [Java CocoaComponent compatibility mode]: Setting timeout for SWT to 0.100000
SCFinderPlugin(114): Unable to get bundle identifier.SCFinderPlugin(114): Unable to get bundle identifier.SCFinderPlugin(114): Unable to get bundle identifier.
Solution
It seems to be caused by "._" (dot underscore) files, which OS X creates when a non-OS X partition is used. I created the project on the Mac partition and, to my surprise, that fixed the problem (so much for paying a premium for increased productivity on a Mac).
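To check whether a project folder contains such files, a short Python sketch like the one below can list them. The function name is mine, and whether deleting the listed files is enough to fix the startup error is an assumption I have not verified; moving the project is what worked for me.

```python
import os


def find_dot_underscore_files(root):
    """Return the paths of all files under root whose names start with '._'."""
    matches = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.startswith("._"):
                matches.append(os.path.join(folder, name))
    return sorted(matches)


if __name__ == "__main__":
    # "." is a placeholder; point it at your project's war folder.
    for path in find_dot_underscore_files("."):
        print(path)
```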