About Me

Seiji Maekawa

Seiji MAEKAWA
(前川 政司)

Research Scientist @ Megagon Labs, California, USA
Ph.D. (Information Science)

Email: seiji [at] megagon.ai

Research Interests

Natural Language Processing

  • Long-context Language Models
  • Retrieval-augmented Language Models
  • Active Learning

Graph Processing

  • Graph Neural Networks
  • Synthetic Graph Generation
  • Attributed Graph Clustering

Social Network Analysis

  • Follower Prediction
  • Graph Analysis on Incomplete Networks

Graph Database

  • Language-aware Indexing
  • Query Language

Experiences

Research Scientist

Megagon Labs Inc. / Mountain View, CA, USA
2024-04 ~ present

Research Associate

Megagon Labs Inc. / Mountain View, CA, USA
2023-04 ~ 2024-03

Research Intern

Megagon Labs Inc. / Mountain View, CA, USA
2022-01 ~ 2022-04
Worked on low-budget active learning, which aims to reduce labeling costs (i.e., human effort) by selecting informative data samples for training language models. [Blog post]

Research Intern

Hotto Link Inc. - 株式会社ホットリンク / Remote
2020-09 ~ 2020-10
Worked on follower prediction under a restricted number of API calls. This work was announced in a press release!

Specially Appointed Researcher/Fellow (Part-time)

Osaka University
2020-04 ~ 2023-03

Sales Engineer (Full-time)

NTT DOCOMO, INC.
2019-04 ~ 2020-03

Study Abroad - Guest Student

Eindhoven University of Technology / Eindhoven, Netherlands
2018-10 ~ 2018-12

Study Abroad - Exchange Student

The Chinese University of Hong Kong / Hong Kong
2017-09 (30 days)

Publications

International Conferences/Workshops

Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data. [paper] [code] [data]
Seiji Maekawa*, Hayate Iso*, Nikita Bhutani.
in International Conference on Learning Representations (ICLR), Apr 2025.

From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization. [paper]
Catarina G. Belem, Pouya Pezeshkpour, Hayate Iso, Seiji Maekawa, Nikita Bhutani, Estevam Hruschka.
in Findings of 2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2025.

Retrieval Helps or Hurts? A Deeper Dive into the Efficacy of Retrieval Augmentation to Language Models. [paper] [poster] [code]
Seiji Maekawa, Hayate Iso, Sairam Gurajada, Nikita Bhutani.
in 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) [oral presentation], acceptance rate: 23.2% (=565/2434), Jun. 2024.

Low-resource Interactive Active Labeling for Fine-tuning Language Models. [paper] [code]
Seiji Maekawa, Dan Zhang, Hannah Kim, Sajjadur Rahman, Estevam Hruschka.
in Findings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), Dec. 2022.

Beyond Real-world Benchmark Datasets: An Empirical Study of Node Classification with GNNs. [paper] [code]
Seiji Maekawa, Koki Noda, Yuya Sasaki, Makoto Onizuka.
in Proceedings of the NeurIPS Datasets and Benchmarks Track, Nov. 2022.

GNN Transformation Framework for Improving Efficiency and Scalability. [paper] [code]
Seiji Maekawa, Yuya Sasaki, George Fletcher, Makoto Onizuka.
in Proceedings of The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), Sep. 2022.

Benchmarking GNNs with GenCAT Workbench. [paper] [code] [demo video]
Seiji Maekawa, Yuya Sasaki, George Fletcher, Makoto Onizuka.
in Demo track of ECML/PKDD, Sep. 2022.

Effective Candidate Selection and Interpretable Interest Extraction for Follower Prediction on Social Media. [paper]
Seiji Maekawa, Santi Saeyor, Takeshi Sakaki, Makoto Onizuka.
in Proceedings of IEEE/WIC/ACM International Conference on Web Intelligence (WI-IAT), Dec. 2021.

Adaptive Node Embedding Propagation for Semi-Supervised Classification. [paper]
Yuya Ogawa, Seiji Maekawa, Yuya Sasaki, Yasuhiro Fujiwara, Makoto Onizuka.
in Proceedings of The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), Sep. 2021.

Controlling Internal Structure of Communities on Graph Generator. [paper]
Hiroto Yamaguchi, Yuya Ogawa, Seiji Maekawa, Yuya Sasaki, Makoto Onizuka.
in Proceedings of 2020 IEEE/ACM ASONAM Demos and Exhibitions Track, Dec. 2020.

General Generator for Attributed Graphs with Community Structure. [paper] [code]
Seiji Maekawa, Jianpeng Zhang, George Fletcher, Makoto Onizuka.
in Proceedings of the ECML/PKDD Graph Embedding and Mining Workshop, Sep. 2019.

Journal Papers

GenCAT: Generating Attributed Graphs with Controlled Relationships between Classes, Attributes, and Topology. [paper] [code]
Seiji Maekawa, Yuya Sasaki, George Fletcher, Makoto Onizuka.
Information Systems, Feb. 2023.

New Attributed Graph Clustering by Bridging Attribute and Topology Spaces. [paper] [code]
Seiji Maekawa, Koh Takeuchi, Makoto Onizuka.
Information Processing Society of Japan, Aug. 2020.

Domestic Conferences/Workshops (Japan)

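An Empirical Study of Node Classification with GNNs Using Diverse Synthetic Graphs (多様な人工グラフを用いた GNN によるノード分類の実証研究).
Seiji Maekawa, Koki Noda, Yuya Sasaki, Makoto Onizuka.
The 15th Forum on Data Engineering and Information Management (DEIM Forum 2023), Mar. 2023.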

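Attributed Graph Generation with Controllable Community Structure (コミュニティ構造を制御可能な属性付きグラフ生成). [abstract]
Seiji Maekawa, Yuya Sasaki, George Fletcher, Makoto Onizuka.
The 83rd National Convention of the Information Processing Society of Japan (IPSJ 2021), Mar. 2021.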

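A Semi-Supervised Node Classification Model Based on Adaptive Node Embedding Propagation (適応的なノード埋め込みの伝搬による半教師ありノード分類モデル). [abstract]
Yuya Ogawa, Seiji Maekawa, Yuya Sasaki, Yasuhiro Fujiwara, Makoto Onizuka.
The 83rd National Convention of the Information Processing Society of Japan (IPSJ 2021), Mar. 2021.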

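Link Prediction Specialized for Target Nodes in Temporal Graphs (時系列グラフにおける着目ノードに特化したリンク予測). [abstract]
Hiroto Yamaguchi, Seiji Maekawa, Yuya Sasaki, Makoto Onizuka.
The 83rd National Convention of the Information Processing Society of Japan (IPSJ 2021), Mar. 2021.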

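Attributed Graph Generation That Controls Community Structure (コミュニティ構造を制御する属性付きグラフ生成). [pdf]
Seiji Maekawa, Yuya Sasaki, George Fletcher, Makoto Onizuka.
The 13th Forum on Data Engineering and Information Management (DEIM Forum 2021), Mar. 2021.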

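An Adaptive Node Embedding Propagation Neural Network for Semi-Supervised Node Classification (半教師ありノード分類のための適応的ノード埋め込み伝搬ニューラルネットワーク). [pdf]
Yuya Ogawa, Seiji Maekawa, Yuya Sasaki, Yasuhiro Fujiwara, Makoto Onizuka.
The 13th Forum on Data Engineering and Information Management (DEIM Forum 2021), Mar. 2021.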

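Link Prediction Specialized for Target Nodes Using Temporal Graphs (時系列グラフを活用する着目ノードに特化したリンク予測). [pdf]
Hiroto Yamaguchi, Seiji Maekawa, Yuya Sasaki, Makoto Onizuka.
The 13th Forum on Data Engineering and Information Management (DEIM Forum 2021), Mar. 2021.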

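A General Generator for Attributed Graphs Considering Community Structure (コミュニティ構造を考慮した属性付きグラフ汎用生成機構). [pdf]
Seiji Maekawa, George Fletcher, Makoto Onizuka.
The 11th Forum on Data Engineering and Information Management (DEIM Forum 2019), Mar. 2019.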

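Graph Clustering Considering Adjacency and Structural Similarity (隣接性と構造類似性を考慮したグラフクラスタリング). [pdf]
Yuya Ogawa, Seiji Maekawa, Koh Takeuchi, Yuya Sasaki, Makoto Onizuka.
The 11th Forum on Data Engineering and Information Management (DEIM Forum 2019), Mar. 2019.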

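Joint Weighted Non-negative Matrix Factorization Using Nonlinear Functions for Attributed Graphs (属性付きグラフのための非線形関数を用いた接合加重非負値行列分解). [pdf]
Seiji Maekawa, Koh Takeuchi, Yuya Sasaki, Makoto Onizuka.
The 10th Forum on Data Engineering and Information Management (DEIM Forum 2018), Mar. 2018.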

Preprint

Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-k. [paper]
Chihiro Taguchi, Seiji Maekawa, Nikita Bhutani.
in arXiv preprint, June 2025.

A Simple and Scalable Graph Neural Network for Large Directed Graphs. [paper] [code]
Seiji Maekawa, Yuya Sasaki, Makoto Onizuka.
in arXiv preprint, June 2023.

Awards & Activities

Awards

  • Student Presentation Award, DEIM 2019 (2019) [Link]
  • Student Award, I-Scover Utilization Contest (2017) [Submission Link]

Talks

  • NLP Colloquium, "Holistic Reasoning with Long-Context LMs" [link] [YouTube], May 2025
  • Tutorial, "An Encouragement of Graph Deep Learning" (グラフ深層学習のすゝめ) [link] [YouTube], DEIM 2023, Mar. 2023
  • 情処ラジオ [link] [YouTube], Feb. 2023

Community Service

  • I contributed an article on NeurIPS 2022 to the Database Society of Japan (DBSJ) [link] (Japanese)
  • My Ph.D. thesis was featured by the Information Processing Society of Japan 🎉 [link] (Japanese)

Program Committee

  • 2025: ICLR, ARR May
  • 2024: ARR April & October, ECML PKDD AI4HR & PES Workshop, EACL NLP4HR Workshop, SDM, TKDE
  • 2023: IEEE BigData, CIKM, ACL Matching Workshop, KDD
  • 2022: NeurIPS D&B, ECML PKDD

Education

Ph.D. in Information Science and Technology

Osaka University / 2020-04 ~ 2023-03
Research: graph neural networks, synthetic graph generation

MS in Information Science and Technology

Osaka University / 2017-04 ~ 2019-03
Research: graph clustering, synthetic graph generation

BE in Informatics

Kyoto University / 2012-04 ~ 2016-09
Research: topic model, word embedding

Skills

Programming

  • Python 3
    • Jupyter Notebook
    • PyTorch, scikit-learn, NumPy, SciPy, pandas, etc.
  • C++
  • SQL