86 lines
4.4 KiB
Markdown
86 lines
4.4 KiB
Markdown
|
Content_Type: blog
|
||
|
Title: It's a post about nothing!
|
||
|
Date: 2022 7 1
|
||
|
|
||
|
The progress update
|
||
|
|
||
|
<p style='text-align: center;'>
|
||
|
<img src="https://old.doordesk.net/pics/plates.gif" />
|
||
|
</p>
|
||
|
|
||
|
### Bots
|
||
|
|
||
|
After finding a number of ways not to begin the project formerly known as my capstone,
|
||
|
I've finally settled on a
|
||
|
[dataset](https://www.kaggle.com/datasets/bwandowando/ukraine-russian-crisis-twitter-dataset-1-2-m-rows). The project is about detecting bots, starting with twitter. I've
|
||
|
[studied](https://old.doordesk.net/projects/bots/docs/debot.pdf) a
|
||
|
[few](https://old.doordesk.net/projects/bots/docs/botwalk.pdf)
|
||
|
[different](https://old.doordesk.net/projects/bots/docs/smu.pdf)
|
||
|
[methods](https://old.doordesk.net/projects/bots/docs/div.pdf) of bot detection and particularly like the
|
||
|
[DeBot](https://old.doordesk.net/projects/bots/docs/debot.pdf) and
|
||
|
[BotWalk](https://old.doordesk.net/projects/bots/docs/botwalk.pdf) methods and think I will try to mimic them,
|
||
|
in that order.
|
||
|
|
||
|
Long story short, DeBot uses a fancy method of time correlation to group accounts
|
||
|
together based on their posting habits. By identifying accounts that all have identical
|
||
|
posting habits that are beyond what a human could do by coincidence, this is a great
|
||
|
first step to identifying an inital group of seed bots. This can then be expanded by
|
||
|
using BotWalk's method of checking all the followers of the bot accounts and comparing
|
||
|
anomalous behavior to separate humans from non-humans. Rinse and repeat. I'll begin this
|
||
|
on twitter but hope to make it platform independent.
|
||
|
|
||
|
### The Real Capstone
|
||
|
|
||
|
The bot project is too much to complete in this short amount of time, so instead I'm
|
||
|
working with a
|
||
|
[small dataset](https://archive-beta.ics.uci.edu/ml/datasets/auto+mpg)
|
||
|
containing info about cars with some specs and I'll predict MPG. The problem itself for
|
||
|
me is trivial from past study/experience as an auto mechanic so I should have a nice
|
||
|
playground to focus completely on modeling. It's a very small data set too at < 400
|
||
|
lines, I should be able to test multiple models in depth very quickly. It may or may not
|
||
|
be interesting, expect a write-up anyway.
|
||
|
|
||
|
### Cartman
|
||
|
|
||
|
Well I guess I've adopted an 8 year old. Based on
|
||
|
[this project](https://github.com/RuolinZheng08/twewy-discord-chatbot)
|
||
|
I've trained a chat bot with the personality of Eric Cartman. He's a feature of my
|
||
|
Discord bot living on a Raspberry Pi 4B, which I would say is probably the slowest
|
||
|
computer you would ever want to run something like this on. It takes a somewhat
|
||
|
reasonable amount of time to respond, almost feeling human if you make it think a bit.
|
||
|
The project uses [PyTorch](https://pytorch.org/) to train the model. I'd like
|
||
|
to re-create it using [TensorFlow](https://www.tensorflow.org/) as an
|
||
|
exercise to understand each one better, but that's a project for another night. It also
|
||
|
only responds to one line at a time so it can't carry a conversation with context,
|
||
|
yet...
|
||
|
|
||
|
### Website
|
||
|
|
||
|
I never thought I'd end up having a blog. I had no plans at all actually when I set up
|
||
|
this server, just to host a silly page that I would change from time to time whenever I
|
||
|
was bored. I've been looking at
|
||
|
[Hugo](https://gohugo.io/) as a way to organize what is now just a list of
|
||
|
divs in a single html file slowly growing out of control. Basically you just dump each
|
||
|
post into its own file, create a template of how to render them, and let it do its
|
||
|
thing. I should be able to create a template that recreates exactly what you see right
|
||
|
now, which is beginning to grow on me.
|
||
|
|
||
|
If you haven't noticed yet, (and I don't blame you if you haven't because only a handful
|
||
|
of people even visit this page) each time there is an update there is a completely new
|
||
|
background image, color scheme, a whole new theme. This is because this page is a near
|
||
|
identical representation of terminal windows open my computer and each time I update the
|
||
|
page I also update it with my current wallpaper, which generates the color scheme
|
||
|
dynamically using
|
||
|
[Pywal](https://github.com/dylanaraps/pywal).
|
||
|
|
||
|
TODO:
|
||
|
* Code blocks with syntax highlighting
|
||
|
* Develop an easy workflow to dump a jupyter notebook into the website and have it display nicely with minimal effort
|
||
|
* Find a way to hack plots generated with matplotlib to change colors with the page color scheme (or find another way to do the same thing)
|
||
|
* Automate generating the site - probably [Hugo](https://gohugo.io/)
|
||
|
* Separate from blog, projects, etc.
|
||
|
* Add socials, contact, about
|
||
|
* A bunch of stuff I haven't even thought of yet
|
||
|
|
||
|
That's all for now
|