Building a Text Summarizer With SBERT and Flask
In this tutorial, we will learn to build a Flask web application that summarizes text using the Sentence-BERT model. Sentence-BERT (SBERT) creates sentence embeddings and supports over 100 languages. You can read more about Sentence-BERT here.
SBERT can also be used to compare the semantic similarity of sentences. When summarizing a lengthy text, it is critical to compare sentences for similarity so that the summary stays faithful to the original text's meaning.
Prerequisites
To follow along with this tutorial, the reader must have the following:
- Basic knowledge of the Python programming language. Here, we will use Python 3 or later.
- An IDE installed, preferably VS Code.
Build a Flask web app
Sentence-BERT (SBERT), a siamese and triplet network-based variant of the BERT model, is capable of deriving semantically meaningful sentence embeddings.
SBERT extends BERT so that large collections of sentences can be compared for semantic similarity, clustered into groups, and queried through semantic search.
BERT itself set new state-of-the-art results on a variety of sentence classification and sentence-pair regression tasks.
Semantically related sentences can be identified using a similarity measure such as cosine similarity. Because such measures can be computed very efficiently on modern hardware, SBERT can be used for both semantic similarity search and clustering.
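To make the similarity measure concrete, here is a minimal sketch of cosine similarity in plain Python. The three short vectors are hypothetical stand-ins for sentence embeddings (real SBERT vectors have hundreds of dimensions), and the function name `cosine_similarity` is our own choice rather than part of any library used in this tutorial.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" for three sentences.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Related sentences should score higher than unrelated ones.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))
```

A value close to 1 means the two vectors point in nearly the same direction, which is how embeddings of semantically related sentences are expected to behave.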
Create a virtual environment
Before we start, let's create a virtual environment. Open the terminal and create a virtual environment named summarizerApp as shown:
python3 -m venv summarizerApp
Then we activate the environment with:
source summarizerApp/bin/activate
Install packages
Flask
We use Flask to make web applications, manage HTTP requests, and render templates:
pip3 install Flask
summarizer
To fetch the most relevant and valuable information out of a lengthy document, we use summarizer.
pip3 install summarizer
sentence-transformers
sentence-transformers is a Python framework that provides state-of-the-art models for creating text and image embeddings.
pip3 install -U sentence-transformers
bert-extractive-summarizer
To do extractive summaries, we use the BERT extractive summarizer, which is built on the HuggingFace PyTorch transformers library. This package provides the summarizer module that we import in the backend code:
pip3 install -q bert-extractive-summarizer
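Before moving on, it can help to confirm that the packages are importable from the active virtual environment. The following is a small sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the list uses import names rather than pip names (for example, bert-extractive-summarizer is imported as `summarizer`).

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names, not pip package names.
required = ["flask", "sentence_transformers", "summarizer"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

If anything is reported as missing, check that the virtual environment is activated and rerun the matching pip3 command.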
Build frontend
Inside the working directory, create a folder called templates with two files inside it:
- index.html
- summary.html
In the index.html file for the home page, we display a text field where the user can submit the text to be summarized:
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Text Summarizer App</title>
    <link
      href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css"
      rel="stylesheet"
      integrity="sha384-1BmE4kWBq78iYhFldvKuhfTAU6auU8tT94WrHftjDbrCEXSU1oBoqyl2QvZ6jIW3"
      crossorigin="anonymous"
    />
  </head>
  <body>
    <nav class="navbar navbar-light" style="background-color: #100da1;">
      <div class="container">
        <a class="navbar-brand">Summarizer</a>
      </div>
    </nav>
    <form action="/summarize" method="post">
      <div class="form-group">
        <label for="exampleFormControlTextarea1" style="padding-top: 2em;">
          <strong>Enter Your Text Below To Be Summarized:</strong>
        </label>
        <br />
        <textarea
          class="form-control"
          id="exampleFormControlTextarea1"
          rows="10"
          name="data"
        ></textarea>
      </div>
      <br />
      <button type="submit" class="btn btn-outline-primary">Summarize</button>
    </form>
  </body>
</html>
To display the summary of the submitted text, we create summary.html as shown:
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Your Summary</title>
    <style>
      p {
        padding-top: 2em;
        text-align: center;
        line-height: 3em;
        color: black;
        word-spacing: 0.25em;
        font-family: "Times New Roman", Times, serif;
      }
    </style>
    <link
      href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css"
      rel="stylesheet"
      integrity="sha384-1BmE4kWBq78iYhFldvKuhfTAU6auU8tT94WrHftjDbrCEXSU1oBoqyl2QvZ6jIW3"
      crossorigin="anonymous"
    />
  </head>
  <body>
    <nav
      class="navbar navbar-light"
      style="background-color: hsl(236, 96%, 22%);"
    >
      <div class="container">
        <a class="navbar-brand">Text SUMMARY</a>
      </div>
    </nav>
    <p>{{ result }}</p>
  </body>
</html>
Build backend
In your project folder, create a file named app.py with the following content:
# Import Flask and the helpers for rendering templates and reading form data
from flask import Flask, render_template, request
# Import the summarizers
from summarizer import Summarizer
from summarizer.sbert import SBertSummarizer
The latest version of bert-extractive-summarizer lets you use Sentence-BERT. You can read more about this library here.
After importing the libraries, we build an SBERT model to summarize the required content as shown below:
# Using an instance of SBERT to create the model
model = SBertSummarizer('paraphrase-MiniLM-L6-v2')

app = Flask(__name__)


@app.route("/")
def msg():
    # Home page with the text input form
    return render_template('index.html')


# The form in index.html submits via POST, so only POST is accepted here
@app.route("/summarize", methods=['POST'])
def getSummary():
    # Text submitted through the textarea named "data"
    body = request.form['data']
    # Summarize the text down to five sentences
    result = model(body, num_sentences=5)
    return render_template('summary.html', result=result)


if __name__ == "__main__":
    app.run(debug=True, port=8000)
In the above code:
- We use an instance of SBERT to create the model. SBertSummarizer('paraphrase-MiniLM-L6-v2') loads a sentence-transformers model that maps sentences and paragraphs to a 384-dimensional dense vector space.
- return render_template('index.html') displays the index.html contents, which is our home page.
- return render_template('summary.html', result=result) displays the summary.html data. In our case, it's the summary page.
- app.run(debug=True, port=8000) starts the development server on localhost at port 8000.
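To build some intuition for what an extractive summarizer does, here is a toy sketch that picks existing sentences from a text instead of writing new ones. It is not how bert-extractive-summarizer works internally: it swaps SBERT embeddings for simple word counts and ranks each sentence by its cosine similarity to the whole document. The names `split_sentences`, `bag_of_words`, and `toy_summary` are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(sentence):
    """Word-count vector; a crude stand-in for a sentence embedding."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def toy_summary(text, num_sentences=2):
    """Keep the sentences most similar to the document as a whole."""
    sentences = split_sentences(text)
    vectors = [bag_of_words(s) for s in sentences]
    document = Counter()
    for vector in vectors:
        document.update(vector)  # sum of all sentence vectors
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cosine(vectors[i], document),
        reverse=True,
    )
    chosen = sorted(ranked[:num_sentences])  # restore original order
    return " ".join(sentences[i] for i in chosen)


text = (
    "Cats are small pets. Cats and dogs are popular pets. "
    "The stock market fell today."
)
print(toy_summary(text, num_sentences=1))
```

The num_sentences argument plays the same role as in model(body, num_sentences=5): it caps how many sentences from the original text are kept.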
First, we render index.html when a user opens the home page. Then, on accepting the input from the form using request.form['data'], we save it to body, summarize it, and render summary.html along with the summarized result.
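Note that request.form['data'] makes Flask respond with a 400 error if the field is missing, and an empty textarea would still be passed to the model. One possible way to guard against this is a small helper like the sketch below. The function name `extract_text` and the `MAX_CHARS` limit are assumptions for illustration, not part of the original app; the helper accepts any mapping with a .get() method, which includes request.form.

```python
from typing import Mapping, Optional

MAX_CHARS = 20_000  # hypothetical limit; tune to your model and hardware


def extract_text(form: Mapping[str, str], field: str = "data") -> Optional[str]:
    """Return cleaned text from a form mapping, or None if it is unusable."""
    raw = form.get(field, "")
    text = " ".join(raw.split())  # collapse runs of whitespace and newlines
    if not text:
        return None
    return text[:MAX_CHARS]


# Plain dicts stand in for request.form here.
print(extract_text({"data": "  Some   text\nto summarize.  "}))
print(extract_text({"data": "   "}))
```

Inside getSummary(), we could then call body = extract_text(request.form) and render index.html again when the result is None, instead of passing empty input to the model.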
Run the application
In your project folder, we should have the following folders and files:
- The summarizerApp folder, which contains the files installed during virtual environment creation.
- The templates folder.
- The app.py file containing the Python script.
Finally, we run the app using the command:
python app.py
In your terminal, the server starts up and prints a warning that it is a development server. This can be ignored for local testing.
Now, open the URL http://127.0.0.1:8000 in your browser to access the frontend.
Testing
Now, enter the text that you wish to summarize and click the Summarize button.
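The same test can also be scripted instead of clicked through. The sketch below uses only the standard library to build the form-encoded POST request that the browser would send. It assumes the app is running locally on port 8000 as configured above; the function names `build_summarize_request` and `fetch_summary_page` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_summarize_request(text, base_url="http://127.0.0.1:8000"):
    """Build a form-encoded POST request for the /summarize route."""
    payload = urlencode({"data": text}).encode("utf-8")
    return Request(
        f"{base_url}/summarize",
        data=payload,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_summary_page(text):
    """Send the request and return the rendered summary page as a string.

    Only works while the Flask app is running.
    """
    with urlopen(build_summarize_request(text)) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network.
request = build_summarize_request("Paste a long passage here.")
print(request.get_method(), request.full_url)
```

With the server running, calling fetch_summary_page() with a long passage returns the HTML of summary.html, with the summary inside the p element.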
Conclusion
In this blog, we learned how to effectively construct a Flask web application that utilizes SBERT to summarize a text.
To start with, we created a virtual environment, installed the packages, coded both the front-end and back-end of our web application, and finally launched it.
You can find the code for this tutorial here.
Happy coding!
Peer Review Contributions by: Srishilesh P S