Accounting generates $110.7 billion annually in domestic revenue alone [1], yet the field is deceptively lacking in automation of its most business-critical processes. Accounting tools do exist for individuals, particularly for filing taxes, but comparable tools scarcely exist for many larger industrial tasks. Exceedingly common business events, such as business combinations, remain surprisingly manual despite their $1.1 trillion valuation in 2020 [2]. This work presents the twin accounting solutions TurboGAAP and TurboIFRS, an attempt to automate and streamline these large accounting tasks, which were once entrusted only to teams of experienced accountants.
A first-to-market approach to a trillion-dollar problem, TurboGAAP and TurboIFRS answer years of demand from the accounting sector that established corporations have not met.
At the outset of this project, I set a number of learning goals that would define its success. I needed to learn how to use the supporting libraries that would help me design this system. I needed to learn how to use the Twitter API and to build the infrastructure behind it that would allow me to collect large amounts of data for machine learning. I also needed to become familiar with common machine learning libraries in Python in order to create the algorithms and pipelines that make predictions based on Twitter data.
This paper details the steps and decisions involved in collecting this data and applying it to machine learning algorithms. I determined how to create labelled data from pre-existing Botometer ratings, and what level of confidence was needed to label data for training. I used the scikit-learn library to build classifiers to best detect bots. I applied a number of pre-processing routines to refine the classifiers’ precision, including natural language processing and data analysis techniques. I eventually moved to remotely hosted versions of the system on Amazon web instances to collect larger amounts of data and train more advanced classifiers. This leads to the details of my final implementation: a user-facing server, hosted on AWS and interfacing over Gmail’s IMAP server.
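The labelling step described above can be illustrated with a minimal sketch. It assumes Botometer-style scores normalized to the range 0 to 1 and keeps only accounts whose score falls confidently on one side. The threshold values, the function names `label_from_score` and `build_training_set`, and the example records are all hypothetical; the abstract does not state the values or code actually used.

```python
from typing import Optional

# Hypothetical thresholds; the source does not state the values actually used.
BOT_THRESHOLD = 0.8
HUMAN_THRESHOLD = 0.2


def label_from_score(score: float,
                     bot_threshold: float = BOT_THRESHOLD,
                     human_threshold: float = HUMAN_THRESHOLD) -> Optional[int]:
    """Return 1 (bot), 0 (human), or None when the score is too ambiguous."""
    if score >= bot_threshold:
        return 1
    if score <= human_threshold:
        return 0
    return None


def build_training_set(accounts):
    """Keep only confidently labelled accounts as parallel text/label lists."""
    texts, labels = [], []
    for account in accounts:
        label = label_from_score(account["score"])
        if label is not None:
            texts.append(account["text"])
            labels.append(label)
    return texts, labels


# Made-up example records, for illustration only.
accounts = [
    {"text": "Win a free phone now, click here", "score": 0.95},
    {"text": "Had a great hike this weekend", "score": 0.05},
    {"text": "Check out my new blog post", "score": 0.50},
]
texts, labels = build_training_set(accounts)
```

Under these assumptions, the ambiguous middle record is dropped, and the remaining `texts` and `labels` could then be vectorized and passed to a scikit-learn classifier for training.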
The paper closes by laying out the current and future development of this system, including more advanced classifiers, better data analysis, conversion to third-party Twitter data collection systems, and user features. I detail what I have learned from this exercise and what I hope to continue working on.