<b>Picviz Labs</b><br />
Welcome to the Picviz Labs Blog!<br />
We value the communication and contributions of those who share a passion for the latest advances in big-data and computer-security practices in today's world.<br />
<br />
<h2>A Collective Approach To Fraud Detection</h2>
Published 2012-10-22<br />
<span style="font-family: Arial, Helvetica, sans-serif;">Fraudulent activity in today’s business world is a
booming industry in its own right.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">More and more organised criminal groups and individuals actively
seek and implement new ways in which they can <b style="mso-bidi-font-weight: normal;">extort, steal and extract valuable intellectual property, funds,
business intelligence and confidential data </b>from companies of all sizes within
all industries. Unfortunately, much of this theft takes place without the
knowledge of the organisations affected until well after the event has
occurred and the money, data or IP is already lost, abused, sold on or made
public.</span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">Adding further complexity to the fight against these
damaging fraudulent activities is the huge and ever-growing reliance on, and advances
in, technology, which place <b style="mso-bidi-font-weight: normal;">cyber-crime</b>
and <b style="mso-bidi-font-weight: normal;">hacking expertise</b> at the
forefront of this conduct.</span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<h3 class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="color: #c0504d; mso-themecolor: accent2;"><span style="font-family: Arial, Helvetica, sans-serif;"><span style="color: orange;">How should companies defend themselves against fraud?</span></span></span></h3>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">What is the best fraud detection practice for businesses
today (and for the future) to help them quickly, easily and pro-actively detect
fraudulent activity, whether it happens internally or through an attack from an external source?</span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">A common and sometimes successful approach is to assemble
an internal fraud detection team.<span style="mso-spacerun: yes;"> </span>Hiring
experienced security or data specialists to implement a complex, technical and
expensive monitoring system that only they can extract the relevant data from
is a familiar practice.<span style="mso-spacerun: yes;"> </span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">Although this sounds like a rational way to address such
an important need within a company, it can instead lead to a costly,
restrictive and limited solution.<span style="mso-spacerun: yes;">
</span>Despite their best efforts, the security and data specialists<b style="mso-bidi-font-weight: normal;"> may only be in a position to discover
‘known’ methods of fraudulent behaviour.</b><span style="mso-spacerun: yes;">
</span>Other experienced staff within a company, often better placed to notice fraudulent
activity in and around their own areas of expertise and responsibility, may never
get the chance to detect acts of fraud occurring directly around them.</span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<h3 class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="color: #c0504d; mso-themecolor: accent2;"><span style="font-family: Arial, Helvetica, sans-serif;"><span style="color: orange;">Key questions can be derived from this method of managing fraud detection:</span></span></span></h3>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">Can a small team of experts effectively produce the best
results in their employer’s fight against fraud? </span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">Or instead could a collective solution, involving many
people within an organisation (not just security or data specialists) intuitively
highlighting fraudulent behaviour, produce better, more accurate results?</span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">Below is a very recent scenario that can qualify these key
questions.</span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<h3 class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="color: #c0504d; mso-themecolor: accent2;"><span style="font-family: Arial, Helvetica, sans-serif;"><span style="color: orange;">Head of fraud detection team steals from her own employer</span></span></span></h3>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">In September 2012, the press heavily reported on the
criminal case of <b style="mso-bidi-font-weight: normal;">a former Head of Online
Security</b> <b style="mso-bidi-font-weight: normal;">stealing £2.4 million</b>
from the UK’s Lloyds Bank, where she was a respected, senior member of staff.<span style="mso-spacerun: yes;"> </span>In summary, the person in charge of managing the
bank’s own online fraud detection team and relevant systems decided that she
was working harder and longer hours than her decent basic salary warranted.<span style="mso-spacerun: yes;"> </span>Subsequently, to top-up her income, she
submitted fake invoices for technology based projects and services <b style="mso-bidi-font-weight: normal;">over a 3 year period</b> that never actually
happened or existed!</span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">There are many things wrong with this <b style="mso-bidi-font-weight: normal;">very real scenario</b>.<span style="mso-spacerun: yes;"> </span>That Lloyds allowed such unsophisticated, fraudulent
activity to continue undetected for so long is quite incredible.<span style="mso-spacerun: yes;"> </span>Furthermore, the person in charge of
detecting fraudulent activity for the firm, a person of trust and
responsibility, fell into the trap of defrauding her own employers.<span style="mso-spacerun: yes;"> </span>Unbelievably, she simultaneously maintained
the company’s extensive fraud detection environment!<span style="mso-spacerun: yes;"> </span>The concept of a ‘collective approach’ to
fraud detection may never have allowed this situation to occur.</span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">A company the size of Lloyds Bank may be able to absorb
such a financial loss.<span style="mso-spacerun: yes;"> </span>The <b style="mso-bidi-font-weight: normal;">reputational damage</b> caused by the
perception that they cannot be trusted to protect their customers’ funds and confidential
data could actually cost them a lot more.<span style="mso-spacerun: yes;">
</span>Independent of Lloyds Bank, losing over £2 million may have put a
smaller company out of business completely before any concern of reputational
damage had even arisen.</span><br />
<span style="font-family: Arial;"></span> </div>
<h3 class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="color: orange; font-family: Arial, Helvetica, sans-serif;">The 'Collective Approach' - A smarter way to manage fraud detection</span></h3>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;">Consider involving the majority of staff within an
organisation in the fraud detection process rather than limiting it to a select
few experts.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">Imagine your company, typically sub-divided into teams, departments
and areas of specific business functions, implemented a solution that was
relevant and available to each individual business group.<span style="mso-spacerun: yes;"> </span>The systems, processes, input and output
commonly utilised and produced by these individual groups would be accessible
via this solution.<span style="mso-spacerun: yes;"> </span>The staff within each
team, as a slight extension of their existing roles and utilising the familiarity
and experience of their own function, can now actively notice and report
suspicious and possibly fraudulent behaviour themselves.<span style="mso-spacerun: yes;"> </span>In theory, operating this type of collective solution
applies the practice of <b style="mso-bidi-font-weight: normal;">‘Neighbourhood
Watch’</b> to fraud detection within a business, with staff keeping an eye out
on their local environment.</span></div>
<br />
<span style="font-family: Arial, Helvetica, sans-serif;">In turn, this practice will <b style="mso-bidi-font-weight: normal;">create an army of people fighting fraud</b> within a company,
all specialists in their own area of employment and readily aware of what to
look out for. This collective approach would also reduce the need for a
dedicated, isolated and expensive fraud detection team and establish a greater
level of responsibility and awareness among all staff within a company of the
pitfalls of fraudulent behaviour.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">For the detractors out there, claiming that this could instead
create a distracting culture of blame and distrust within a business, and that
some staff may lack the motivation to participate in such an approach, think
again.<span style="mso-spacerun: yes;"> </span>If some form of undetected fraud
causes huge financial losses for a company, putting jobs and salaries at risk, <b style="mso-bidi-font-weight: normal;">any diligent employee would pro-actively participate</b>
in this type of collective solution to ensure that situation is never allowed
to occur.</span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><br />
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">Here at Picviz, we are working to provide this type of cost-effective,
collective solution to empower companies of all sizes to better
manage their fraud detection needs.<span style="mso-spacerun: yes;"> </span>Be
sure to get in touch with us to learn more about our solution and how your fraud
detection practices can easily be enhanced for the future.</span></div>
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
</div>
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span lang="FR" style="mso-ansi-language: FR;"><span style="font-family: Arial, Helvetica, sans-serif;">Dean Edwards</span></span></div>
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">
<span lang="FR" style="mso-ansi-language: FR;">Picviz Labs -
2012 Assises de la Sécurité Award Winner for Innovation</span></span></div>
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">
</span><a href="http://www.picviz.com/"><span lang="FR" style="mso-ansi-language: FR;"><span style="color: blue; font-family: Arial, Helvetica, sans-serif;">www.picviz.com</span></span></a></div>
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">
@picviz</span></div>
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">
@deanedwards78</span></div>
<div class="MsoNoSpacing" style="margin: 0cm 0cm 0pt;">
</div>
<h2 style="color: #ffd966; font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
A Concise Introduction to using CSS with Qt Classes and Custom-Made Classes</h2>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
Published 2012-05-02</div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
Qt is a really nice, efficient framework. It's a real pleasure to be able to style objects with CSS-like declarations. However, I had a hard time making my own Qt widgets interact faithfully with my CSS declarations.<br />
<br /></div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
I think the main reason is that the most relevant parts of Qt's documentation are somewhat scattered and not clearly linked to each other when you browse through them.</div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
This post is a quick introduction to, and summary of, what you should know to work efficiently with Qt's style sheets.</div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br /></div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<b>Note!</b>: <i>This post is about Qt version 4.8. It should also be valid, more or less, for older or newer versions of Qt.</i></div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br /></div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<b>Note! #2</b>: <i>This post is not a tutorial. It is intended as a collection of pointers to the most relevant parts of Qt's documentation.</i></div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br /></div>
<h3 style="color: #f1c232; font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
The Very Basics</h3>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br />
First of all, if you are not very familiar with CSS stylesheets, or if you think you forgot how selectors work in some cases, you should have a look at <a href="http://qt-project.org/doc/qt-4.8/stylesheet-syntax.html" target="_blank">Qt's Style Sheet Syntax</a>.</div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br /></div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
Secondly, if you want some information on the way the Box Model works, and what content/padding/border/margin means, go to the <a href="http://qt-project.org/doc/qt-4.8/stylesheet-customizing.html#sub-controls" target="_blank">Customizing Qt Widgets Using Style Sheets</a> page.<br />
<br /></div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
Don't miss the explanation of sub-controls at the end of that page: they are specific to Qt widgets.<br />
<br /></div>
<h3 style="color: #f1c232; font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
The Reference Document</h3>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br />
You would be wise to print or bookmark the <a href="http://qt-project.org/doc/qt-4.8/stylesheet-reference.html" target="_blank">Qt Style Sheets Reference</a>. Whenever you need to know what can be done with Qt Style Sheets, and how to do it, you will find the answers within the reference.<br />
<br />
If you need some examples for a specific kind of widget, take a look at <a href="http://qt-project.org/doc/qt-4.8/stylesheet-examples.html" target="_blank">Qt Style Sheets Examples</a>.</div>
<h3 style="color: #f1c232; font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
</h3>
<h3 style="color: #f1c232; font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
What to do with your Own Widgets?</h3>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br />
Sometimes, in your own project, you need to derive a class from QWidget. If you do so, pay special attention to what the <a href="http://qt-project.org/doc/qt-4.8/stylesheet-reference.html" target="_blank">Qt Style Sheets Reference</a> says about QWidget, especially if you still wish to use Qt Style Sheets to customize the look and feel of your own classes:</div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br /></div>
<div style="color: #f9cb9c; font-family: "Courier New",Courier,monospace;">
If you subclass from <a href="http://qt-project.org/doc/qt-4.8/qwidget.html">QWidget</a>, you need to provide a paintEvent for your custom <a href="http://qt-project.org/doc/qt-4.8/qwidget.html">QWidget</a> as below:</div>
<pre class="cpp" style="color: #f9cb9c; font-family: "Courier New",Courier,monospace;"> <span class="type">void</span> CustomWidget<span class="operator">::</span>paintEvent(<span class="type"><a href="http://qt-project.org/doc/qt-4.8/qpaintevent.html">QPaintEvent</a></span> <span class="operator">*</span>)
{
<span class="type"><a href="http://qt-project.org/doc/qt-4.8/qstyleoption.html">QStyleOption</a></span> opt;
opt<span class="operator">.</span>init(<span class="keyword">this</span>);
<span class="type"><a href="http://qt-project.org/doc/qt-4.8/qpainter.html">QPainter</a></span> p(<span class="keyword">this</span>);
style()<span class="operator">-</span><span class="operator">></span>drawPrimitive(<span class="type"><a href="http://qt-project.org/doc/qt-4.8/qstyle.html">QStyle</a></span><span class="operator">::</span>PE_Widget<span class="operator">,</span> <span class="operator">&</span>opt<span class="operator">,</span> <span class="operator">&</span>p<span class="operator">,</span> <span class="keyword">this</span>);
}</pre>
<div style="color: #f9cb9c; font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="color: #f9cb9c; font-family: "Courier New",Courier,monospace;">
The above code is a no-operation if there is no stylesheet set.</div>
<div style="color: #f9cb9c; font-family: "Courier New",Courier,monospace;">
<b>Warning:</b> Make sure you define the <a href="http://qt-project.org/doc/qt-4.8/qobject.html#Q_OBJECT">Q_OBJECT</a> macro for your custom widget.</div>
<div style="color: #f9cb9c; font-family: "Courier New",Courier,monospace;">
<br /></div>
<h3 style="color: #f1c232; font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
Updating CSS: don't recompile! Reload Style Sheets...</h3>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br />
When you are fine-tuning your Style Sheets and need to check the result of each small modification, recompiling your project every time is not practical. </div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br />
I usually define a global (application-wide) shortcut that dynamically reloads the main Qt Style Sheet from a specific location on the hard drive (and not from the resources pseudo-file system because these resources can only be modified by recompiling the project...).<br />
<br />
This process can save a lot of time!</div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br />
For those interested in this way of working, here is a quick-and-dirty way of doing it. It relies on reading a file from the hard drive rather than through the Qt Resource System; otherwise a recompilation would be needed, which is exactly what we are trying to avoid!<br />
<br />
Here's how you can trap a certain Key Event of YourWidget (let's say it is the $ key...):<br />
<br />
<div style="color: #f6b26b;">
void YourWidget::keyPressEvent(QKeyEvent *event) {</div>
<div style="color: #f6b26b;">
[...] </div>
<div style="color: #f6b26b;">
case Qt::Key_Dollar:
<br />
{
</div>
<div style="color: #f6b26b;">
// We access the file and load it; if it cannot be opened, keep the current style<br />
QFile css_file("/path/to/your/css/file/gui.css");
<br />
if (!css_file.open(QFile::ReadOnly | QFile::Text)) break;
<br />
QTextStream css_stream(&css_file);
<br />
QString css_string(css_stream.readAll());
<br />
css_file.close();
</div>
<div style="color: #f6b26b;">
<br /></div>
<div style="color: #f6b26b;">
// We apply the CSS file<br />
<span style="color: #cc0000;">setStyleSheet(css_string);
</span><br />
<span style="color: #cc0000;"> setStyle(QApplication::style());</span>
<br />
break;
<br />
}
</div>
<div style="color: #f6b26b;">
<br /></div>
<div style="color: #f6b26b;">
}</div>
<br />
Good Luck!<br />
<br /></div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
[Posted by PhS]</div>
<div style="font-family: "Helvetica Neue",Arial,Helvetica,sans-serif;">
<br /></div>
<h2>Our CanSecWest 2012 slides on passive DNS and Picviz</h2>
Published 2012-03-14<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-6c_GEjfSIBo/T2BdiyPshYI/AAAAAAAAABI/59VVdNkflIU/s1600/ADST-Cansec12.JPG" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="211" src="http://2.bp.blogspot.com/-6c_GEjfSIBo/T2BdiyPshYI/AAAAAAAAABI/59VVdNkflIU/s320/ADST-Cansec12.JPG" width="320" /></a></div>
Alexandre Dulaunoy from <a href="http://circl.lu/">CIRCL.LU</a> and Sebastien Tricaud from <a href="http://www.picviz.com/">Picviz Labs </a>spoke at <a href="http://www.cansecwest.com/">CanSecWest</a> 2012 in Vancouver, Canada, on how to scrutinize a country using passive DNS and Picviz.<br />
<br />
It was a great conference, and we took the opportunity to share a joint research project on how to run passive DNS services and rank BGP autonomous systems (AS) on one side, and how to analyze huge datasets using Picviz and its visualization on the other.<br />
<br />
The slides <a href="http://www.picviz.com/docs/slides/picviz-pdns.pdf">are available here for download</a>. Enjoy!<br />
<br />
<br />
<h2>Syrian Bluecoat logs analysis - part 1</h2>
Published 2012-01-26<br />
Back in October 2011, Telecomix released 54 GB of compressed BlueCoat SG-9000 logs (7 out of 15 proxies) covering the period from July 22nd to August 5th, 2011. The logs can be downloaded from <a href="http://tcxsyria.ceops.eu/95191b161149135ba7bf6936e01bc3bb">http://tcxsyria.ceops.eu/95191b161149135ba7bf6936e01bc3bb</a> .<br />
<br />
Having such logs is really cool, because there aren't many freely available logs out there. I mean real <b>and </b>usable logs (logs containing not just attacks or just normal traffic, but both). People are still writing papers <a href="http://www.ll.mit.edu/mission/communications/ist/corpora/ideval/docs/index.html">using the old DARPA dataset from 1998</a>!<br />
<br />
This is a great way for us to demonstrate our technology, as <a href="http://www.picviz.com/sections/products/ASPI-analysis-stations.html">Picviz Inspector</a> is able to handle big log data analysis. As we found some cool stuff during a quick analysis (the whole process took about thirty minutes), we think it is worth sharing.<br />
<br />
<h3>
Computer used for the analysis</h3>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-1aOInTLztt4/TyCLUCKGaGI/AAAAAAAAAAM/emreTy9EZPY/s1600/ASPI-196.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="240" src="http://2.bp.blogspot.com/-1aOInTLztt4/TyCLUCKGaGI/AAAAAAAAAAM/emreTy9EZPY/s320/ASPI-196.png" width="320" /></a></div>
We used our ASPI L 192 station, which has two Intel Xeon 2.66 GHz CPUs with 12 cores each, 12 RAM modules of 16 GB each, and two graphics cards: one nVidia Quadro 5000 and one nVidia Tesla C2050.<br />
<br />
This is a great machine to compile your code in record time :-)<br />
<br />
We need such a machine because we want big data visualization with interactivity.<br />
<br />
<br />
<br />
<h3>
Data overview</h3>
$ file SG_main__420722212535.log<br />
SG_main__420722212535.log: ASCII text, with very long lines, with CRLF line terminators<br />
<br />
When looking at the data, raw files show things like (just 2 events):<br />
2011-07-22 20:34:51 282 ce6de14af68ce198 - - - OBSERVED "unavailable" http://www.surfjunky.com/members/sj-a.php?r=44864 200 TCP_NC_MISS GET text/html http www.surfjunky.com 80 /members/sj-a.php ?r=66556 php "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.65 Safari/534.24" 82.137.200.42 1395 663 -<br />
2011-07-22 20:34:51 216 6154d919f8d56690 - - - OBSERVED "unavailable" http://x31.iloveim.com/build_3.9.2.1/comet.html 200 TCP_NC_MISS GET text/html;charset=UTF-8 http x31.iloveim.com 80 /servlets/events ?1122064400327 - "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.18) Gecko/20110614 Firefox/3.6.18" 82.137.200.42 473 1129 -<br />
<br />
When formatted properly, one event looks like:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://3.bp.blogspot.com/-T9LjqRL2UKw/TyFOj_dgrII/AAAAAAAAAA4/12wJXxbBcBs/s1600/normalized-bluecoat.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="174" src="http://3.bp.blogspot.com/-T9LjqRL2UKw/TyFOj_dgrII/AAAAAAAAAA4/12wJXxbBcBs/s640/normalized-bluecoat.png" width="640" /></a></div>
<br />
You can see 25 dimensions per event. Some are empty, and some (c-ip) have been replaced with a hash value so that people analyzing the logs cannot identify the real individuals who may have offended the government!<br />
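For readers who want to check the field count outside Picviz, here is a minimal C++ sketch (it is not how our Rapid Log Acquisition works) that splits one raw line on spaces while keeping double-quoted values together. The function name split_log_line is made up for this example, and the assumption that quotes never appear escaped inside a value is based only on the two sample events shown above.<br />

```cpp
#include <string>
#include <vector>

// Split one access-log line on spaces, keeping double-quoted values
// (such as the User-Agent) together as a single field.
// Assumption: quotes are never escaped inside a quoted value.
std::vector<std::string> split_log_line(const std::string& line) {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;
    bool has_field = false;
    for (char c : line) {
        if (c == '"') {
            in_quotes = !in_quotes;
            has_field = true;            // "" still counts as an (empty) field
        } else if (c == ' ' && !in_quotes) {
            if (has_field) fields.push_back(current);
            current.clear();
            has_field = false;
        } else if (c != '\r') {          // the files use CRLF line terminators
            current += c;
            has_field = true;
        }
    }
    if (has_field) fields.push_back(current);
    return fields;
}
```

Applied to the first sample event above, this yields 25 fields, with the hashed c-ip value at index 3 and the User-Agent at index 20 (both positions read off the sample line, not taken from a format specification).<br />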
<br />
We open the log in Picviz using our Rapid Log Acquisition, <a href="http://www.picviz.com/datasheet/whitepaper-picviz-rla.pdf">as described in this whitepaper</a>.<br />
Focusing on one field may be fine for building a top 10 in a pie chart, as you can see here: <a href="http://hellais.github.com/syria-censorship/">http://hellais.github.com/syria-censorship/</a>, but it is not enough to get a global and detailed view of those logs, <b>all those logs</b>.
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://3.bp.blogspot.com/-9SzG-_lhw6Q/TyCUJaD-b6I/AAAAAAAAAAU/ypljIHII5KM/s1600/PC-cartesian.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="174" src="http://3.bp.blogspot.com/-9SzG-_lhw6Q/TyCUJaD-b6I/AAAAAAAAAAU/ypljIHII5KM/s320/PC-cartesian.png" width="320" /></a></div>
As Parallel Coordinates is the only technique that can plot such large data with so many dimensions without taking the user away from the data (through a top/least-N summary, or by looking at no more than three dimensions at a time), we decided to plot the logs in Picviz so we could start looking at them and quickly find cool stuff in them.<br />
<br />
If you want more information on Parallel Coordinates, I recommend reading <a href="http://www.juiceanalytics.com/writing/parallel-coordinates/">this page</a>.<br />
<br />
As this is a rather quick analysis (writing this blog post took more time!), there will of course be more articles about what we can extract from those logs; they will skip the basics covered here and will come in later parts.<br />
<br />
First of all, let's have a look at those data:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://1.bp.blogspot.com/-EyZbOJHhXmY/TyCVPZsWLSI/AAAAAAAAAAc/JUL37yRfLqE/s1600/syrian-proxy-firstlook.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="122" src="http://1.bp.blogspot.com/-EyZbOJHhXmY/TyCVPZsWLSI/AAAAAAAAAAc/JUL37yRfLqE/s400/syrian-proxy-firstlook.png" width="400" /></a></div>
<br />
Here we have the global structure of the data. Some dimensions have been removed because they were empty throughout the log file: <b>cs-username</b>, <b>cs-auth-group</b> and <b>x-virus-id</b>. Others were added: we split the <b>cs(Referer)</b> field into 6 dimensions to get a better understanding of the Referer URL (<b>protocol</b>, <b>domain </b>only, <b>TLD</b>, <b>port</b>, <b>URL</b>, <b>variable </b>added to the URL).<br />
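To make that Referer split concrete, here is a rough C++ sketch of one way to cut a URL into six such parts. It is an illustration only, not the splitter Picviz uses: the struct and function names are invented, "domain" is taken to be the full host name and "TLD" its last dot-separated label (the post does not define either precisely), and IPv6 hosts are not handled.<br />

```cpp
#include <string>

// Hypothetical container for the six Referer dimensions.
struct RefererParts {
    std::string protocol;  // e.g. "http"
    std::string domain;    // assumed here: the full host name
    std::string tld;       // assumed here: last label of the host name
    std::string port;      // empty when the URL has no explicit port
    std::string url;       // path part, starting with '/'
    std::string variable;  // query string, without the leading '?'
};

RefererParts split_referer(const std::string& referer) {
    RefererParts parts;
    std::string rest = referer;

    const std::string::size_type scheme_end = rest.find("://");
    if (scheme_end != std::string::npos) {
        parts.protocol = rest.substr(0, scheme_end);
        rest = rest.substr(scheme_end + 3);
    }
    const std::string::size_type query_start = rest.find('?');
    if (query_start != std::string::npos) {
        parts.variable = rest.substr(query_start + 1);
        rest = rest.substr(0, query_start);
    }
    const std::string::size_type path_start = rest.find('/');
    if (path_start != std::string::npos) {
        parts.url = rest.substr(path_start);
        rest = rest.substr(0, path_start);
    }
    const std::string::size_type port_start = rest.rfind(':');
    if (port_start != std::string::npos) {
        parts.port = rest.substr(port_start + 1);
        rest = rest.substr(0, port_start);
    }
    parts.domain = rest;
    const std::string::size_type last_dot = rest.rfind('.');
    if (last_dot != std::string::npos) parts.tld = rest.substr(last_dot + 1);
    return parts;
}
```

Each part then becomes its own axis, so a selection on a single TLD or port does not require a regular expression over the whole Referer string.<br />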
<br />
<h3>
Data analysis</h3>
<b>Tracking Zeus</b><br />
<br />
Zeus is a rather famous botnet. For more information on Zeus, you can read what has been written by the Polish CERT.<br />
<br />
First, let's have a look at Zeus domains, using the regular expression <a href="http://www.cert.pl/news/4711/langswitch_lang/en">defined by the excellent Polish CERT</a>:<br />
<blockquote class="tr_bq">
[a-z0-9]{32,48}\.(ru|com|biz|info|org|net)</blockquote>
This gives the following selection:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-989YP3pSBfc/TyCY0tjfyfI/AAAAAAAAAAk/JyXOubqk5s0/s1600/full.log_text_bluecoat.sgos54_000.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="245" src="http://2.bp.blogspot.com/-989YP3pSBfc/TyCY0tjfyfI/AAAAAAAAAAk/JyXOubqk5s0/s400/full.log_text_bluecoat.sgos54_000.png" width="400" /></a></div>
<br />
Interestingly, we can see that in this period of time, only one user is affected. The expression matches the following four domains:<br />
<ul>
<li>df600de61d94e3e43300a2160d3d72f4.info</li>
<li>ebook.howtoviewprivatefacebookprofiles.com</li>
<li>howtoviewprivatefacebookprofiles.com</li>
<li>www.effectivetimemanagementstrategies.com</li>
</ul>
As for the c-ip field, these events all carry the IP "0.0.0.0" rather than the user hash seen in most of the log.<br />
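The same selection can be tried outside Picviz with a few lines of C++. This is a sketch under our own assumptions: the function name is invented, and the expression is applied unanchored (search semantics), which is consistent with the four matches listed above but is not something the Polish CERT article is quoted on here.<br />

```cpp
#include <regex>
#include <string>

// Returns true when the host name contains a label of 32 to 48
// lowercase letters or digits followed by one of the listed TLDs.
// The pattern is the one quoted above; it is searched for anywhere
// in the string rather than anchored to its start and end.
bool looks_like_zeus_domain(const std::string& host) {
    static const std::regex pattern(
        R"([a-z0-9]{32,48}\.(ru|com|biz|info|org|net))");
    return std::regex_search(host, pattern);
}
```

Since the expression only tests label length and character set, a long but human-readable name such as howtoviewprivatefacebookprofiles.com (exactly 32 characters before the TLD) matches as well, so each hit is better treated as a candidate to review than as a confirmed Zeus domain.<br />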
<br />
<b> Finding funny User Agents</b><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://4.bp.blogspot.com/-FKravG-Pw7M/TyFecmvDY-I/AAAAAAAAABA/eIf6_ZHS9lc/s1600/frequency-filter.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="200" src="http://4.bp.blogspot.com/-FKravG-Pw7M/TyFecmvDY-I/AAAAAAAAABA/eIf6_ZHS9lc/s200/frequency-filter.png" width="181" /></a></div>
The User Agents dimension is always full of surprises. We decided to apply a filter on its frequency of appearance<b> using the log function</b> in order to clearly separate the small values from the others. <br />
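As a rough idea of what such a filter does, the C++ sketch below counts how often each value appears and keeps those whose log10 of the count is at or below a cutoff. It is one plausible reading of "using the log function", not the Picviz implementation; the function name and the cutoff parameter are invented for the example.<br />

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Keep the values whose frequency, on a log10 scale, is at or below
// max_log10_count. With a cutoff of 1.0, values seen at most 10 times
// are kept. The result is sorted because std::map iterates in key order.
std::vector<std::string> rare_values(const std::vector<std::string>& values,
                                     double max_log10_count) {
    std::map<std::string, std::size_t> counts;
    for (const std::string& v : values) ++counts[v];

    std::vector<std::string> rare;
    for (const auto& entry : counts) {
        const double log_count = std::log10(static_cast<double>(entry.second));
        if (log_count <= max_log10_count) rare.push_back(entry.first);
    }
    return rare;
}
```

Working on a log scale spreads out the low counts, which is where the unusual user agents live, while the handful of very common browser strings end up far away at the other end.<br />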
<br />
Working with the sorted unique values, we found a lot of cool stuff. The list has about 50k entries. Things that could look like a parser issue have been double-checked, and they are not: these are the real user agents that were sent, as the other fields of those events are filled in correctly. Among the stuff that we enjoyed, we have:<br />
<br />
<ul>
<li>Mozila/4.0 (compatible; MSIE 5.0; LEAKCHECK) </li>
<li>%7BPRODUCT_NAME%7D/1.7.6 CFNetwork/485.13.8 Darwin/11.0.0</li>
<li>%D8%B1%D8%B3%D8%A7%D8%A6%D9%84%20%D8%A7%D9%84%D8%AD%D8%A8/1.1.0 CFNetwork/485.13.9 Darwin/11.0.0</li>
<li>Microsoft(r) Windows(tm) FTP Folder</li>
<li>'%22()&%1<ScRiPt >prompt(953201)</ScRiPt></li>
<li>QSP 196:3[0] R{81388-}</li>
<li>䚰�’s://ieframe.dll/background_gradient.jpg</li>
<li>1pB4kE1pB1m1wnG882g5_sxigw002284sn0k85gzEjBARMTEuMC4yLjU1Ng==</li>
</ul>
We filtered on the last one to understand what kind of request could generate something that looks like (but isn't) base64-encoded data or a random hash value. At first, we thought it could be a covert channel. It isn't. We found this:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://1.bp.blogspot.com/-Mj6eAEmfpyA/TyCdQ095m7I/AAAAAAAAAAs/U4X-L-c1PJY/s1600/full.log_text_bluecoat.sgos54_001.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="123" src="http://1.bp.blogspot.com/-Mj6eAEmfpyA/TyCdQ095m7I/AAAAAAAAAAs/U4X-L-c1PJY/s400/full.log_text_bluecoat.sgos54_001.png" width="400" /></a></div>
<br />
And with the associated data (one event in 645):<br />
2011-08-02,11:21:23,34,0.0.0.0,-,-,-,OBSERVED,unavailable,,-,,,,{NULLCHAR}00,TCP_HIT,GET,application/octet-stream,http,dnl-18.geo.kaspersky.com,80,/index/u0607g.xml.dif,-,dif,1pBqgBumBovkhvCgvk6rx6ssywkr9qo0115t2w0oCUARMTEuMC4xLjQwMA==,82.137.200.42,774,272,-<br />
<br />
All those different values were associated with hosts matching the pattern ".{3}-\d+\.geo\.kaspersky\.com". We wonder why such a user agent is being used.<br />
<br />
<br />
<h3>
Conclusion </h3>
This is a first attempt at analyzing this large volume of logs globally. Log analysts are very fortunate to have such a great resource, and we would like to thank Telecomix for sharing it. It is great to see how quickly the Picviz approach can find things in this data, including things we were not looking for.<br />
<br />
We will share more analysis on this blog in the future, including some interesting domain names (live.com, Yahoo Mail, etc.) that are currently being blocked by the Syrian regime. And as we finally have the pleasure of working interactively with so much data and so many dimensions, we will of course find interesting things we are not aware of at the moment.<br />
<br />
If you have any comments, feedback or questions, do not hesitate to share them, as they can help us improve the following articles.<br />
<br />
<h2>Picviz Labs blog grand opening!</h2>
Published 2012-01-25<br />
Welcome to our fresh new blog, where we will post about life at Picviz Labs, along with cool data analysis!